Constraining Dark Energy by Combining Cluster Counts and 
Shear-Shear Correlations in a Weak Lensing Survey 
We study the potential of a large future weak-lensing survey to constrain dark energy properties by 
using both the number counts of detected galaxy clusters (sensitive primarily to density fluctuations 
on small scales) and tomographic shear-shear correlations (restricted to large scales). We use the 
Fisher matrix formalism, assume a flat universe, and parameterize the equation of state of dark 
energy by $w(a) = w_0 + w_a(1 - a)$, to forecast the expected statistical errors from either observable, 
and from their combination. We show that the covariance between these two observables is small, 
and argue that they can therefore be regarded as independent constraints. We find that when the 
number counts and the shear-shear correlations (on angular scales $\ell < 1000$) are combined, an LSST 
(Large Synoptic Survey Telescope)-like survey can yield statistical errors on $\Omega_{DE}$, $w_0$, $w_a$ as tight as 
0.003, 0.03, 0.1. These values are a factor of 2-25 better than using either observable alone. The 
results are also about a factor of two better than those from combining number counts of galaxy 
clusters and their power spectrum. 
I. INTRODUCTION 

The existence of dark energy is strongly indicated by 
the relatively dim appearance of distant supernovae [HU 
by the shortfall of the matter density to make the uni- 
verse spatially flat (e.g.Q), and by recent, accurate 
measurements of cosmic microwave background (CMB) 
anisotropy 0, Q . This newly discovered form of energy 
constitutes about 2/3 of the total energy density of the 
universe, and is known to have negative pressure and a 
nearly uniform spatial distribution. Several competing 
theoretical models have been proposed to explain dark 
energy (see, e.g., @ for a list of references). While cur- 
rent observational data cannot distinguish among these 
proposals, it is hoped that future observations, which will 
reach higher precision, can constrain models, and clarify 
the nature of dark energy. 

In this paper, we explore one of the methods to con- 
strain dark energy parameters in the future: using the 
weak gravitational lensing (WL) distortion of distant 
galaxies in a large survey of the sky. The light from 
distant galaxies is deflected by the foreground gravita- 
tional field, causing small but statistically coherent dis- 
tortions in the observed shapes of the galaxies. This so- 
called weak-lensing shear signal can be observed, in prin- 
ciple, for a large number of galaxies, and used to infer 
the foreground gravitational field or almost equivalently, 
the mass distribution (see recent reviews by, e.g., 0,(1]). 

Several previous studies have examined the constraints 
that can be placed on dark energy from large weak- 
lensing surveys. The most commonly proposed method 
is to utilize statistical properties of the two-dimensional 
shear field directly, such as its power spectrum (or, equiv- 
alently, the shear-shear correlation function). With ad- 
ditional photometric redshift information, which future 
wide-field multi-color surveys are expected to provide, 
the background galaxies can be grouped into different 
redshift bins. Statistics done within each bin and among 
different bins can recover some of the information from 
the third (radial) dimension. Such a "tomographic" 
study of the foreground density fluctuations provides an 
additional handle on dark energy, through the effect of 
dark energy on the recent (0 < z < 1) growth of dark 
matter perturbations and the expansion history of the 
universe (e.g. [1 M, ED, QJ, M, Q d ). In particu- 
lar, Song & Knox [15] have evaluated the statistical con- 
straints from tomographic shear-shear correlations that 
are expected to be available from the Large Synoptic Sur- 
vey Telescope (LSST). 

A complementary method is to utilize statistical prop- 
erties of the peaks of the shear field. This method, 
however, does not lend itself to straightforward math- 
ematical analysis, and has been much less 
well explored (e.g. [HI, [T3|). On the other hand, there is 
an increasingly better correspondence between higher- 
$\sigma$ shear peaks and discrete, massive virialized objects 
(galaxy clusters) in the foreground [H, [li|. To the 
extent that this correspondence can be quantified ab 
initio in numerical simulations, one can use the shear- 
selected cluster sample, including their abundance evolu- 
tion and power spectrum, to constrain dark energy prop- 
erties (e.g. [!3,|21]). In general, the abundance evolution 
of galaxy clusters can place strong constraints on dark 
energy parameters, because it is exponentially sensitive 
to the growth rate of matter fluctuations. Several studies 
have explored constraints expected from future surveys 
that would detect clusters through their X-ray emission 
or the Sunyaev-Zeldovich effect (e.g. [22], [23], [24], [25]). An 
attractive feature of utilizing the weak lensing signal to de- 
tect clusters is that it directly probes the total mass of 
the cluster. In particular, Wang et al. [20] have evaluated 
the statistical constraints from the galaxy cluster sample 
that is expected to be available from LSST. 

Once the data from a wide-field WL survey, such as 
LSST [26], or a smaller precursor mission, such as Pan- 
STARRS [27], has been collected, it will be logical to try to 
extract information on dark energy properties from both 
the shear-shear correlations and cluster abundance, since 
both pieces of information will be available in the same 
data-set. The goal of the present paper is to quantify the 
improvement on dark energy constraints when these two 
pieces of information are combined. 

Ideally, one would pose more ambitious questions, such 
as: what is the maximum information one can obtain 
on dark energy parameters, given the effect of dark en- 
ergy on the full non-linear shear field? In particular, it 
is not clear whether either statistic (shear-shear corre- 
lations or cluster counts), in fact, captures a significant 
fraction of the available information. Nonetheless, in this 
paper, we content ourselves with answering the much sim- 
pler question above. Furthermore, for simplicity, we will 
use the shear-shear correlation function only on large an- 
gular scales ($\ell < 1000$). On large scales, where den- 
sity fluctuations are in the linear regime, the correlation 
function contains all the information about the density 
field. On smaller scales, non-linear evolution introduces 
significant non-Gaussianity. We will argue below that 
once cluster counts are taken into account, the small- 
scale shear-shear correlations offer no significant addi- 
tional information. In this paper, we focus on evaluating 
constraints for a ground-based survey, such as LSST. We 
parameterize dark energy by its bulk equation of state 
$w = \langle P\rangle/\langle\rho\rangle$ [28], and allow $w$ to evolve linearly with the 
cosmic scale-factor $a$ as $w(a) = w_0 + w_a(1 - a)$. 
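
As an illustration of this parameterization, the short Python sketch below evaluates $w(a)$ and the corresponding dark energy density, obtained by integrating $d\ln\rho/d\ln a = -3[1 + w(a)]$. It is a minimal sketch rather than the code used for the forecasts; the function names are hypothetical and introduced here only for illustration.

```python
import numpy as np

def w_of_a(a, w0=-1.0, wa=0.0):
    """Equation of state w(a) = w0 + wa * (1 - a)."""
    return w0 + wa * (1.0 - a)

def de_density_ratio(a, w0=-1.0, wa=0.0):
    """rho_DE(a) / rho_DE(a=1), from d ln(rho)/d ln(a) = -3 [1 + w(a)].

    Integrating from a' = 1 to a gives
    a^(-3 (1 + w0 + wa)) * exp(-3 wa (1 - a)).
    """
    a = np.asarray(a, dtype=float)
    return a ** (-3.0 * (1.0 + w0 + wa)) * np.exp(-3.0 * wa * (1.0 - a))
```

For the fiducial choice $w_0 = -1$, $w_a = 0$ the density ratio is unity at all epochs, which recovers a cosmological constant.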

This paper is organized as follows. In § II we describe 
our calculational methods, which closely follow previous 
Fisher matrix analyses, but include an additional discus- 
sion of covariance. Our results are presented in § III 
and discussed, along with various caveats, in § IV. Fi- 
nally, in § V we offer our conclusions and summarize the 
implications of this work. 



II. CALCULATIONAL METHODS 

A. Cluster Number Counts 

We closely follow ref. [20], and consider a sample of 
shear-selected clusters with specifications motivated by 
LSST. In particular, we assume the sample covers the 
redshift range of 0.1 < z < 1.4, which is divided into 
26 redshift bins of equal size $\Delta z = 0.05$. The expected 
number of clusters in the $i$th redshift bin is calculated as 

$$N_i = \Delta\Omega\,\Delta z\,\frac{d^2V}{dz\,d\Omega}(z_i)\int_{M_{\rm min}(z_i)}^{\infty}\frac{dn}{dM}(z_i, M)\,dM, \qquad (1)$$

where $z_i$ is the central redshift of the $i$th redshift bin, $\Delta\Omega$ 
is the solid angle covered by the survey, which for LSST 
we take to be 18,000 deg$^2$, $d^2V/dz\,d\Omega$ is the comoving 
volume element, $M_{\rm min}$ is the detection threshold mass 
for clusters, and $dn/dM$ is the cluster mass function. 
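
The geometric part of the count prediction can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used for the forecasts: it assumes a flat universe with $\Omega_m = 1 - \Omega_{DE}$, uses the standard flat-space relation $d^2V/dz\,d\Omega = c\,\chi^2/H(z)$, and takes the integrated mass function above threshold as an input supplied by the caller. All function names are hypothetical.

```python
import numpy as np

C_KM_S = 299792.458  # speed of light [km/s]

def hubble(z, omega_de=0.73, omega_mh2=0.14, w0=-1.0, wa=0.0):
    """H(z) in km/s/Mpc for a flat universe with w(a) = w0 + wa (1 - a)."""
    omega_m = 1.0 - omega_de
    h = np.sqrt(omega_mh2 / omega_m)
    a = 1.0 / (1.0 + z)
    de = a ** (-3.0 * (1.0 + w0 + wa)) * np.exp(-3.0 * wa * (1.0 - a))
    return 100.0 * h * np.sqrt(omega_m * (1.0 + z) ** 3 + omega_de * de)

def comoving_distance(z, n=2001, **cosmo):
    """chi(z) = c * int_0^z dz'/H(z') in Mpc, using the trapezoid rule."""
    zs = np.linspace(0.0, z, n)
    f = C_KM_S / hubble(zs, **cosmo)
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(zs)))

def volume_element(z, **cosmo):
    """d^2V/(dz dOmega) = c chi^2 / H(z), in Mpc^3 per steradian."""
    return C_KM_S * comoving_distance(z, **cosmo) ** 2 / hubble(z, **cosmo)

def expected_counts(z_i, n_above_threshold, area_deg2=18000.0, dz=0.05, **cosmo):
    """N_i for one bin; n_above_threshold [Mpc^-3] is the mass function
    integrated above M_min(z_i), which this sketch does not compute."""
    area_sr = area_deg2 * (np.pi / 180.0) ** 2
    return area_sr * dz * volume_element(z_i, **cosmo) * n_above_threshold

# Centers of the 26 bins covering 0.1 < z < 1.4.
bin_centers = 0.125 + 0.05 * np.arange(26)
```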



We use the fitting formula given by Jenkins et al. [29] 
for the cluster mass function, 

$$\frac{dn}{dM}(z, M) = 0.301\,\frac{\rho_m}{M}\,\frac{d\ln\sigma^{-1}(M, z)}{dM}\exp\left[-\left|\ln\sigma^{-1}(M, z) + 0.64\right|^{3.82}\right], \qquad (2)$$

where $\rho_m$ is the present-day matter density, $\sigma(M, z)$ is 
the amplitude of the linear matter fluctuations at redshift 
$z$, smoothed by a top-hat window function whose scale is 
such that the enclosed mass at the mean density $\rho_m$ is $M$. 
This formula is based on the identification of dark matter 
halos as spherical regions that have a mean overdensity 
of 180 with respect to the background matter density at 
the time of identification. 
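
For readers who wish to experiment, the mass function can be coded in a few lines. The sketch below assumes the commonly quoted coefficients of the Jenkins et al. fit for halos of mean overdensity 180 (0.301, 0.64 and 3.82); these values should be checked against the original reference before any quantitative use, and the function names are hypothetical.

```python
import numpy as np

def jenkins_f(sigma):
    """Multiplicity function f(sigma) of the overdensity-180 fit.
    The coefficients are assumed values, quoted for illustration only."""
    sigma = np.asarray(sigma, dtype=float)
    return 0.301 * np.exp(-np.abs(np.log(1.0 / sigma) + 0.64) ** 3.82)

def dn_dM(M, sigma, dlninvsigma_dM, rho_m):
    """dn/dM = f(sigma) * (rho_m / M) * d ln(sigma^-1) / dM.

    sigma and its logarithmic derivative must be supplied by the caller,
    e.g. from a linear power spectrum code.
    """
    return jenkins_f(sigma) * rho_m / M * dlninvsigma_dM
```

The exponential cutoff at small $\sigma$ (high mass or high redshift) is what makes the abundance of massive clusters so sensitive to the growth of fluctuations.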

In determining the mass threshold for detecting a clus- 
ter at redshift $z$, $M_{\rm min}(z)$, we follow NFW [30], and model 
the density profile of galaxy clusters with the self-similar 
function 

$$\rho_{\rm NFW}(r) = \frac{\rho_s}{(r/r_s)\,[1 + (r/r_s)]^2}, \qquad (3)$$

where $r$ is the radius from the cluster center, and $r_s$, $\rho_s$ 
are a characteristic radius and density. In [30], the 
density profile is truncated at $r_{200}$, inside which the mean 
overdensity with respect to the critical density of the uni- 
verse at redshift $z$ is 200. The outer radius $r_{200}$ is pa- 
rameterized by the concentration parameter, defined as 

$$c_{\rm NFW} = \frac{r_{200}}{r_s}, \qquad (4)$$

and taken to be a constant $c_{\rm NFW} = 5$ in this paper. 
With the definition of the cluster mass $M_{200}$ as the mass 
enclosed within $r_{200}$, the cluster's structure ($\rho_s$, $r_s$, $r_{200}$) 
is fully determined by $M_{200}$ and $z$. 
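
The statement that $M_{200}$ and $z$ fix the profile can be made concrete with the standard NFW relations. The following is a minimal sketch, assuming the Hubble rate $H(z)$ is supplied by the caller and using units of Mpc, km/s and solar masses; the function names are hypothetical.

```python
import numpy as np

G_NEWTON = 4.30091e-9  # gravitational constant [Mpc (km/s)^2 / Msun]

def critical_density(H_z):
    """rho_crit(z) = 3 H(z)^2 / (8 pi G), in Msun/Mpc^3 for H in km/s/Mpc."""
    return 3.0 * H_z ** 2 / (8.0 * np.pi * G_NEWTON)

def nfw_params(m200, H_z, c_nfw=5.0):
    """Return (r200, r_s, rho_s) for a halo of mass m200 [Msun]."""
    rho_c = critical_density(H_z)
    r200 = (3.0 * m200 / (4.0 * np.pi * 200.0 * rho_c)) ** (1.0 / 3.0)
    r_s = r200 / c_nfw
    m_c = np.log(1.0 + c_nfw) - c_nfw / (1.0 + c_nfw)
    rho_s = (200.0 / 3.0) * rho_c * c_nfw ** 3 / m_c
    return r200, r_s, rho_s

def nfw_enclosed_mass(r, r_s, rho_s):
    """Mass inside radius r, from integrating the NFW profile analytically."""
    x = r / r_s
    return 4.0 * np.pi * rho_s * r_s ** 3 * (np.log(1.0 + x) - x / (1.0 + x))
```

By construction, evaluating `nfw_enclosed_mass` at `r200` returns the input `m200`; the same function can be evaluated at a larger radius when a different overdensity definition of the halo mass is needed.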

We then follow Hamana et al. [18], and use a Gaussian 
window function to smooth the convergence field induced 
by the cluster, 

$$\kappa_G = \int d^2\theta\,W_G(\theta)\,\kappa(\theta), \qquad (5)$$

$$W_G(\theta) = \frac{1}{\pi\theta_G^2}\exp\left(-\frac{\theta^2}{\theta_G^2}\right), \qquad (6)$$

where the center of the smoothing kernel is set to that of 
the cluster. Here $\theta$ is the angular distance from the clus- 
ter center, and $\theta_G$ is the size of the smoothing aperture, 
which we choose to be 1 arcmin. Assuming that along 
the line of sight, there is only one lens at redshift z (i.e., 
the cluster to be detected), the convergence can be cal- 
culated, given a distribution of the background galaxies, 
as 

$$\kappa(\theta) = \frac{4\pi G}{c^2}\,\Sigma(\theta)\,\frac{\chi_z}{1 + z}\,\frac{\int_z^\infty dz'\,(dn/dz')\,(1 - \chi_z/\chi_{z'})}{n_{\rm tot}}. \qquad (7)$$

Here $\Sigma$ is the surface density of the cluster, projected 
along the line of sight (see [18] for more details), $\chi_z$ is 
the comoving radial distance to redshift $z$ (note that we 
assume a flat universe in this paper), $dn/dz$ is the mean 
redshift distribution of the surface number density (per 
steradian) of background galaxies, and $n_{\rm tot}$ is the mean 
total surface density. 

Throughout this paper, we adopt both $dn/dz$ and $n_{\rm tot}$ 
from the ground-based survey described in [15]. In 
particular, those authors assumed $n_{\rm tot} = 65$ arcmin$^{-2}$, 
which may be optimistic by a factor of ~two (e.g. [21]; 
see also ref. [31] for an extended discussion on statisti- 
cal galaxy shape measurements). In this paper, we are 
primarily interested in the relative improvement of con- 
straints when two methods are combined, rather than 
in the absolute constraints. We therefore keep the 
optimistic values, to facilitate a comparison with earlier 
work. We assume the noise on the measured convergence 
comes from the r.m.s. intrinsic ellipticity of the back- 
ground galaxies, and is given by [32] 



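$$\sigma^2_{\rm noise} = \frac{\sigma_\epsilon^2}{4}\,\frac{1}{2\pi\theta_G^2\,n_{\rm tot}}. \qquad (8)$$

Here $\sigma_\epsilon$ is the weighted average of the r.m.s. intrinsic 
ellipticity per component of the galaxies, given by 

$$\sigma_\epsilon^2 = \frac{\int_0^\infty dz\,(dn/dz)\,\sigma_\epsilon^2(z)}{n_{\rm tot}}, \qquad (9)$$

where we follow [15], and take $\sigma_\epsilon(z)$ to be 

$$\sigma_\epsilon(z) = 0.3 + 0.07z. \qquad (10)$$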



Finally, we set the threshold convergence $\kappa_{\rm th}$ to be 
4.5 times the noise. Setting $\kappa_G$ in equation (5) equal 
to $4.5\,\sigma_{\rm noise}$ yields an implicit equation for $M_{200}(z)$. To 
be consistent with the halo mass defined for the Jenkins 
et al. fitting formula for the cluster mass function, we 
extend the NFW density profile for a cluster with mass 
$M_{200}(z)$ outward to a radius at which the mean interior 
density is 180 times the background matter den- 
sity at redshift $z$. The mass enclosed within this radius 
is adopted as $M_{\rm min}(z)$ in equation (1). 

In the mock survey defined above (and with the fiducial 
cosmological parameters defined below), we find that the 
limiting halo mass is $\sim (0.6 - 4) \times 10^{14}\,M_\odot$, depending on 
redshift. The smallest halos therefore correspond to small 
groups, containing, on average, ~(10-60) galaxies above 
the absolute $r$ magnitude $M_r \lesssim -20$ threshold for the 
Sloan Digital Sky Survey (see Figure 3 in ref. [33]). Note 
that LSST will detect galaxies to a much greater depth, 
which may increase the number of detectable member 
galaxies (although this is unclear; see, e.g., Figure 3 in 
ref. [11] that shows no increase). 

The total number of clusters in the survey down to this 
mass threshold is $N_{\rm total} = 276{,}794$. This is somewhat 
larger than the number in our previous study [20]. The 
reason for the increase is that in the previous paper, we 
excluded background galaxies at $z > 2.5$, whereas here we 
include them (and both distributions were normalized to 
have the same $n_{\rm tot}$). As a result, we find here an increase 
in the convergence produced by a given foreground galaxy 
cluster. Two additional, smaller differences are that here 
we adopt a smoothing filter size of 1 arcmin (rather than 
0.5 arcmin), and that we use a slightly higher, redshift- 
dependent intrinsic ellipticity (given by eq. (10), rather 
than a fixed value of 0.3\/2). 



B. Shear-Shear Correlations 

Here we closely follow ref. [15], and consider the shear- 
shear correlation function in an LSST-like ground-based 
survey. From $z = 0$ to $z = 3.2$, we divide the back- 
ground galaxies into 8 equally-spaced redshift bins, and 
we imagine that the solid angle covered in the survey is 
probed by pixels of size $\Omega_{\rm pix}$. On average, a given 
pixel will probe $N^b_{\rm pix}$ galaxies in redshift bin $b$, with 
$N^b_{\rm pix} = n^b\,\Omega_{\rm pix}$, where $n^b$ is the mean surface density of 
the galaxies in this redshift bin, and is given by 

$$n^b = \int_{z^b_{\rm min}}^{z^b_{\rm max}} dz\,(dn/dz), \qquad (11)$$

with $z^b_{\rm min}$ and $z^b_{\rm max}$ the edges of this redshift bin. 

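The detected shear from these galaxies depends both 
on the lensing shear signal and on their intrinsic elliptic- 
ity, 

$$\gamma_{1,{\rm pix}} = \gamma_{1,{\rm lens}} + \gamma_{1,{\rm int}} = \frac{1}{N^b_{\rm pix}}\sum_{i=1}^{N^b_{\rm pix}}\left(\frac{\epsilon_{+i,{\rm lens}}}{2} + \frac{\epsilon_{+i,{\rm int}}}{2}\right), \qquad (12)$$

$$\gamma_{2,{\rm pix}} = \gamma_{2,{\rm lens}} + \gamma_{2,{\rm int}} = \frac{1}{N^b_{\rm pix}}\sum_{i=1}^{N^b_{\rm pix}}\left(\frac{\epsilon_{\times i,{\rm lens}}}{2} + \frac{\epsilon_{\times i,{\rm int}}}{2}\right), \qquad (13)$$

where $\gamma_1$, $\gamma_2$ are the two independent components of the 
shear field, and $\epsilon_+$, $\epsilon_\times$ are the two components of the galaxy 
ellipticity. As before, the intrinsic ellipticity is the only 
source of noise we consider in this paper for the shear- 
shear correlations. Furthermore, we assume that there 
is no correlation between the intrinsic ellipticities of different 
galaxies. In this case, $\gamma_{1,{\rm int}}$, when averaged over a large 
enough number of realizations of the intrinsic ellipticity of these 
$N^b_{\rm pix}$ galaxies, will have a variance of 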



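$$\langle\gamma^2_{1,{\rm int}}\rangle = \frac{1}{4(N^b_{\rm pix})^2}\sum_{i=1}^{N^b_{\rm pix}}\langle\epsilon^2_{+i,{\rm int}}\rangle = \frac{1}{4N^b_{\rm pix}\,n^b}\int_{z^b_{\rm min}}^{z^b_{\rm max}} dz\,(dn/dz)\,\langle\epsilon^2_{+,{\rm int}}(z)\rangle. \qquad (14)$$

A similar expression holds for $\gamma_{2,{\rm int}}$. Following [15] 
again, we assume that the r.m.s. intrinsic ellipticity 
varies with redshift as 

$$\langle\epsilon^2_{+,{\rm int}}(z)\rangle = \langle\epsilon^2_{\times,{\rm int}}(z)\rangle = \sigma_\epsilon^2(z), \qquad (15)$$

where $\sigma_\epsilon(z)$ is given by equation (10). 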
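
The scaling of the shape noise with the number of galaxies per pixel can be checked with a small Monte Carlo experiment. The sketch below assumes Gaussian, uncorrelated intrinsic ellipticities with a single r.m.s. value per component, and takes the shear estimate to be half of the mean ellipticity; under these assumptions the variance should approach $\sigma_\epsilon^2/(4N_{\rm pix})$. The function name and the default arguments are hypothetical.

```python
import numpy as np

def gamma_int_variance(sigma_eps, n_pix, n_trials=20000, seed=0):
    """Monte Carlo variance of the intrinsic shear noise in one pixel.

    Each trial draws n_pix uncorrelated ellipticity components and
    averages them; the shear is half of the mean ellipticity.
    """
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, sigma_eps, size=(n_trials, n_pix))
    gamma = 0.5 * eps.mean(axis=1)
    return float(gamma.var())

# Compare with the analytic expectation sigma_eps^2 / (4 n_pix).
measured = gamma_int_variance(0.37, 50)
expected = 0.37 ** 2 / (4.0 * 50)
```

The two numbers agree to within the Monte Carlo scatter, which is about one percent for the default number of trials.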

We assume further that there is no correlation between 
the lensing signal and noise, and expand the shear fields 
in terms of spherical harmonics. The covariance matrix 
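$C^{\gamma\gamma}$ of the expansion coefficients in the E mode can then 
be written as the sum of two matrices $S^{\gamma\gamma}$ (signal) and 
$N^{\gamma\gamma}$ (noise), whose elements are 

$$S^{\gamma\gamma}_{\ell m b,\,\ell' m' b'} = C^{b,b'}_\ell\,\delta_{\ell\ell'}\,\delta_{mm'}, \qquad (16)$$

$$N^{\gamma\gamma}_{\ell m b,\,\ell' m' b'} = N^b\,\delta_{b,b'}\,\delta_{\ell\ell'}\,\delta_{mm'}, \qquad (17)$$

where $b$, $b'$ label redshift bins, and $\ell$, $m$, $\ell'$, $m'$ label spher- 
ical harmonic modes. The shear angular power spectrum 
is then given by 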



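$$C^{b,b'}_\ell = \frac{\pi^2\ell}{2}\int dz\,\frac{d\chi_z}{dz}\,\frac{W^b(z)\,W^{b'}(z)}{\chi_z^3}\,\Delta^2_\Phi(k, z), \qquad (18)$$

where $k = \ell/\chi_z$, and $\Delta^2_\Phi(k, z)$ is the variance of the gravi- 
tational potential fluctuations per $\ln k$ interval. The win- 
dow function is given by 

$$W^b(z) = \frac{2\chi_z}{n^b}\int_{z^b_{\rm min}}^{z^b_{\rm max}} dz'\,\frac{dn}{dz'}\,\frac{\chi_{z'} - \chi_z}{\chi_{z'}}\,\Theta(z' - z), \qquad (19)$$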



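where $\Theta$ is the step function. The diagonal elements of 
the noise matrix $N^{\gamma\gamma}$ are independent of $\ell$ and $m$ (although they 
depend on the redshift bin), and are given by 

$$N^b = \langle\gamma^2_{\rm int}\rangle_b\,\Omega_{\rm pix}, \qquad (20)$$

where $\langle\gamma^2_{\rm int}\rangle_b$ refers either to $\langle\gamma^2_{1,{\rm int}}\rangle$ or to $\langle\gamma^2_{2,{\rm int}}\rangle$, as given 
by equation (14). 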

In the late universe, $\Delta^2_\Phi(k, z)$ is simply related to 
$\Delta^2(k, z)$, the variance of matter density fluctuations per 
$\ln k$ interval, by 

$$\Delta^2_\Phi(k, z) = \left(\frac{3\Omega_m}{2a}\right)^2\left(\frac{H_0}{ck}\right)^4\Delta^2(k, z), \qquad (21)$$

where $\Omega_m$ is the present-day matter density parameter, 
$H_0$ is the present-day Hubble constant, $a$ is the scale 
factor normalized to unity today, and $c$ is the speed of 
light. 
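
The conversion from the matter variance to the potential variance is a one-line operation. The sketch below is illustrative only; it assumes $k$ in units of Mpc$^{-1}$ and $H_0$ in km s$^{-1}$ Mpc$^{-1}$, and the function name and default values are hypothetical choices made for this example.

```python
C_KM_S = 299792.458  # speed of light [km/s]

def potential_variance(delta2, k, z, omega_m=0.27, H0=72.0):
    """Delta_Phi^2(k, z) from the matter variance Delta^2(k, z).

    Implements (3 Omega_m / 2a)^2 (H0 / (c k))^4 Delta^2, with
    k in 1/Mpc and H0 in km/s/Mpc, so that H0/(c k) is dimensionless.
    """
    a = 1.0 / (1.0 + z)
    return (3.0 * omega_m / (2.0 * a)) ** 2 * (H0 / (C_KM_S * k)) ** 4 * delta2
```

The steep $k^{-4}$ factor shows why the potential fluctuations are dominated by large scales even when the density variance grows toward small scales.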

This equation neglects fluctuations in dark energy. For 
a scalar-field model of dark energy, this is assured only for 
wavelengths shorter than the Compton wavelength of 
the scalar field, or $k \gg k_Q$, where $k_Q$ is the Compton 
wavenumber. Ma et al. [35] gave a fitting formula for 
$k_Q(z)$ when the scalar field equation of state parameter 
$w$ is a constant and the universe is spatially flat, 



kn 



Ml in) 



'{1-w) 



2w - 



+ (1 - Slm)a 



— 3w 



(22) 

Since $k = \ell/\chi$, we require $\ell \gg k_Q\chi$ for self-consistency. 
In a cosmological model close to our fiducial model (de- 
fined in the next section) but with $w \to -1$, the max- 
imum of $k_Q\chi$ is ~30. We follow [15] and impose an 
upper limit on the angular scales utilized in our study, 
given by $\ell > 40$. Note, however, that dark energy fluc- 
tuations are unlikely to be actually detectable on large 
scales, and our results are insensitive to this lower limit 
on $\ell$ (see discussion below). 



C. Error Estimates 

We assume the spatial curvature of the universe 
is zero, and adopt a 7-dimensional cosmological pa- 
rameter set $\{\Omega_{DE}, \Omega_m h^2, \sigma_8, w_0, w_a, \Omega_b h^2, n_s\}$, where 
$\Omega_{DE}$, $\Omega_m$, $\Omega_b$ are the present-day energy density parameters 
of dark energy, total matter (cold dark matter + baryons), 
and baryons, respectively, $h$ is the Hubble constant in 
units of 100 km s$^{-1}$ Mpc$^{-1}$, $n_s$ is the index of the pri- 
mordial matter power spectrum, and $\sigma_8$ is the amplitude 
of the linear matter density fluctuations today smoothed 
on a scale of $8h^{-1}$ Mpc. We consider a time-varying equa- 
tion of state parameter, given by 

$$w(a) = w_0 + w_a(1 - a). \qquad (23)$$



In our fiducial model, we choose the following values of 
the 7 parameters: {0.73,0.14,0.9,-1,0,0.024,1}, which 
are consistent with the current "concordance model" [5(] . 

To estimate uncertainties on the cosmological param- 
eters obtained by a specific probe, we use the Fisher ma- 
trix formalism. The Fisher matrix is defined as 

$$F_{\alpha\beta} = -\left\langle\frac{\partial^2\ln L}{\partial p_\alpha\,\partial p_\beta}\right\rangle, \qquad (24)$$

where $p_\alpha$, $p_\beta$ represent the 7 model parameters we want 
to constrain, $L$ is the likelihood function, the derivatives 
are evaluated at the true parameter set (which in our case 
is the adopted fiducial parameter set), and the average 
is taken over many realizations of the data set. The un- 
certainty on the parameter $p_\alpha$ after marginalizing over all 
other parameters is obtained as $\sqrt{(F^{-1})_{\alpha\alpha}}$, which gives 
a lower limit on the accuracy of $p_\alpha$ for any unbiased es- 
timator of the parameters [36], in the absence of 
systematic errors. 
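
The difference between marginalized and unmarginalized errors is easy to see numerically. The following sketch uses a made-up 2 x 2 Fisher matrix with strongly correlated parameters; the matrix entries and function names are purely illustrative and do not correspond to any result in this paper.

```python
import numpy as np

def marginalized_errors(F):
    """sqrt of the diagonal of F^-1: errors after marginalizing over
    all other parameters."""
    return np.sqrt(np.diag(np.linalg.inv(F)))

def unmarginalized_errors(F):
    """1/sqrt(F_aa): errors when all other parameters are held fixed."""
    return 1.0 / np.sqrt(np.diag(F))

# Toy Fisher matrix with a strong degeneracy between two parameters.
F_toy = np.array([[4.0, 1.9],
                  [1.9, 1.0]])
marg = marginalized_errors(F_toy)
fixed = unmarginalized_errors(F_toy)
```

In this toy case the marginalized errors are roughly three times larger than the errors with the other parameter fixed, which is the signature of a parameter degeneracy.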

For number counts, we assume that the counts in each 
of the 26 redshift bins are independent Poisson random 
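variables, with the mean values given by equation (1). 
The Fisher matrix in this case is constructed as [37], 

$$F^{\rm counts}_{\alpha\beta} = \sum_{i=1}^{26}\frac{\partial N_i}{\partial p_\alpha}\,\frac{\partial N_i}{\partial p_\beta}\,\frac{1}{N_i}. \qquad (25)$$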
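
In matrix form, the Poisson Fisher matrix is a weighted product of the derivative array with itself. The sketch below shows this for a toy case with two parameters and two redshift bins; the numbers are invented for illustration and the function name is hypothetical.

```python
import numpy as np

def counts_fisher(dN_dp, N):
    """Poisson Fisher matrix for independent bin counts.

    dN_dp : array of shape (n_params, n_bins), derivatives of the mean counts
    N     : array of shape (n_bins,), mean counts in each bin
    Returns F[a, b] = sum_i dN_i/dp_a * dN_i/dp_b / N_i.
    """
    dN_dp = np.asarray(dN_dp, dtype=float)
    N = np.asarray(N, dtype=float)
    return (dN_dp / N) @ dN_dp.T

# Toy example: two parameters, two bins.
N_toy = np.array([100.0, 400.0])
dN_toy = np.array([[10.0, 20.0],
                   [5.0, -40.0]])
F_counts_toy = counts_fisher(dN_toy, N_toy)
```

Bins in which the two derivatives have opposite signs contribute negative off-diagonal terms, which is how the redshift dependence of the counts helps to separate parameters.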



For the shear-shear correlations, we assume that the 
spherical harmonic expansion coefficients are Gaussian 
random variables whose means are zero, and whose co- 
variance matrix is $C^{\gamma\gamma}$. The Fisher matrix is then con- 
structed as [38] 

$$F^{\gamma\gamma}_{\alpha\beta} = f_{\rm sky}\sum_\ell\frac{2\ell + 1}{2}\sum_{b_1,b_2,b_3,b_4}[C^{b_1,b_2}_\ell]_{,\alpha}\,W^{b_2,b_3}_\ell\,[C^{b_3,b_4}_\ell]_{,\beta}\,W^{b_4,b_1}_\ell, \qquad (26)$$

where $f_{\rm sky}$ is the fraction of the sky covered by the survey 
(in our case, $f_{\rm sky} = 0.44$), $[X]_{,\alpha}$ denotes the derivative 
of $[X]$ with respect to $p_\alpha$, $C^{b,b'}_\ell$ is calculated from equa- 
tion (18), and the $W^{b,b'}_\ell$ are the elements of the inverse of 
the covariance matrix, 

$$W = (C^{\gamma\gamma})^{-1} \qquad (27)$$
$$\quad = W^{b,b'}_\ell\,\delta_{\ell\ell'}\,\delta_{mm'}. \qquad (28)$$
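
For a single multipole, the sum over the four bin indices in the Gaussian Fisher matrix is simply the trace of a matrix product. The sketch below assumes the standard form $f_{\rm sky}\,(2\ell + 1)/2\;{\rm Tr}[C_{,\alpha}\,C^{-1}\,C_{,\beta}\,C^{-1}]$ per multipole; the inputs are toy arrays and the function name is hypothetical.

```python
import numpy as np

def shear_fisher_single_ell(ell, C, dC, f_sky=0.44):
    """Contribution of one multipole to the shear Fisher matrix.

    C  : (n_bins, n_bins) covariance (signal plus noise) at this ell
    dC : (n_params, n_bins, n_bins) derivatives of C for each parameter
    """
    W = np.linalg.inv(C)
    n_params = dC.shape[0]
    F = np.empty((n_params, n_params))
    for a in range(n_params):
        for b in range(n_params):
            # The sum over b1..b4 of C,a W C,b W is the trace of the product.
            F[a, b] = f_sky * (2 * ell + 1) / 2.0 * np.trace(dC[a] @ W @ dC[b] @ W)
    return F

# Sky fraction for 18,000 deg^2 (the full sky is about 41,253 deg^2).
f_sky_lsst = 18000.0 / (4.0 * np.pi * (180.0 / np.pi) ** 2)
```

The full matrix follows by summing these contributions over the multipoles retained in the analysis.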

Non-linear gravitational clustering in the late uni- 
verse will induce non-Gaussian signatures in the matter 
fluctuation field. Although we incorporate non- 
linear effects in the matter power spectrum, these 
effects will also render the likelihood function non- 
Gaussian. To avoid this complication, we neglect all 
modes with $\ell > 1000$ [39]. 

To compute the linear matter power spectrum, we use 
KINKFAST [40], a version of CMBFAST [41] modified 
to accommodate a time-varying equation of state param- 
eter w, to calculate the transfer function at z = 0, and 
we obtain the linear growth function by integrating the 
differential equations given in the Appendix of ref. [22]. 
For the shear-shear correlations, the non-linear matter 
power spectrum is constructed following Smith et al. [12]. 
The derivatives in the Fisher matrices are calculated by 
two-sided numerical approximations. For both $F^{\rm counts}$ 
and $F^{\gamma\gamma}$, we chose a step-size of $\Delta w_a = \pm 0.01$, and $\pm 1\%$ 
of the fiducial value for the other 6 parameters. We have 
verified directly that these step-sizes are small enough 
for the Fisher matrix entries to have converged. 
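
A two-sided derivative of this kind can be sketched as follows. The step sizes follow the prescription quoted above (an absolute step of 0.01 for $w_a$, whose fiducial value is zero, and 1% of the fiducial value otherwise); the ordering of the parameter vector, the function names and the toy model in the usage note are assumptions made for illustration.

```python
import numpy as np

# Fiducial values, ordered as (Omega_DE, Omega_m h^2, sigma_8, w0, wa, Omega_b h^2, n_s).
FIDUCIAL = np.array([0.73, 0.14, 0.9, -1.0, 0.0, 0.024, 1.0])
W_A_INDEX = 4

def default_steps(p_fid=FIDUCIAL, wa_index=W_A_INDEX):
    """1% of the fiducial value for each parameter, except an absolute
    step of 0.01 for w_a (whose fiducial value is zero)."""
    steps = 0.01 * np.abs(p_fid)
    steps[wa_index] = 0.01
    return steps

def two_sided_derivative(model, p_fid, index, step):
    """Central difference [f(p + h) - f(p - h)] / (2 h) along one parameter."""
    p_plus = np.array(p_fid, dtype=float)
    p_minus = np.array(p_fid, dtype=float)
    p_plus[index] += step
    p_minus[index] -= step
    return (model(p_plus) - model(p_minus)) / (2.0 * step)
```

Here `model` stands for any observable, such as the counts in a bin or a power spectrum value, written as a function of the parameter vector. Convergence can be tested by halving the steps and checking that the derivatives do not change appreciably.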

D. Covariance Between Number Counts and Shear 

Our treatment of the number counts (only shot noise 
is considered) and of their combination with shear-shear cor- 
relations (the covariance between the two is neglected) 
is quite simplified. To yield more accurate predictions, 
sample variance errors for the number counts, and the 
covariance between number counts and shear-shear cor- 
relations should be taken into account. 

The effect of sample variance on the constraining power 
of number counts has been considered in detail in pre- 
vious work [HI, In particular, ref. [H[ finds (see 
their Figure 2) that the degradation on dark energy con- 
straints, when the sample variance error is added to the 
Poisson error, depends mostly on the mass threshold. At 
the lower end of the range of our fiducial mass thresh- 
olds, $\sim (0.6 - 4) \times 10^{14}\,M_\odot$, the degradation is a fac- 
tor of ~2, while at the upper end, there is negligible 
degradation (see their Figure 7). However, these degra- 
dations are overestimates, because ref. [43] excluded the 
"signal" that arises from the cosmology dependence of 
the sample variance. When this information is included, 
sample variance errors should cause a smaller degrada- 
tion in the number count constraints (and possibly even 
improvement, if the survey is sub-divided into many an- 
gular cells, as in HU). 



For our purposes, the more important question is 
whether the covariance between number counts and 
shear-shear correlations is significant. The potential con- 
cern is that the shear-shear correlations probe the same 
realization of the density field as the cluster counts, and 
therefore simply adding the constraints from the two ob- 
servables may overestimate their combined constraining 
power. Indeed, in the hypothetical limit that the mean 
cluster abundance in a particular direction and redshift 
bin is fully predictable, given measurements of the shear 
correlations, the cluster counts should not yield any new 
information on dark energy. However, we show here that 
the covariance is very small, and the two observables can 
safely be regarded as independent. In this section, we 
summarize the results of the covariance calculation; the 
interested reader is encouraged to consult the Appendix 
for details. 

In the general case, the Fisher matrix, defined in equa- 
tion (24), involves the expectation value of the deriva- 
tives of the joint likelihood function $L = L(x, p)$, where 
$x$ is a vector of the observables, and $p$ is a vector of 
the model parameters. In our case, $x$ contains the num- 
ber counts in 26 redshift bins, $\{N_i, 1 \le i \le 26\}$, and the 
spherical harmonic expansion coefficients in 8 different redshift bins, 
$\{a^b_{\ell m}, 1 \le b \le 8, 41 \le \ell \le 1000, -\ell \le m \le \ell\}$ 
(note that we have 8,002,560 + 26 = 8,002,586 observ- 
ables). Here both $N_i$ and $a^b_{\ell m}$ are random variables, and 
in the discussion below, the probability distribution of 
both is taken to be determined by large-scale density 
fluctuations alone. In practice, the measurement of ei- 
ther quantity represents a discrete sampling of a con- 
tinuous random field, and will therefore have an addi- 
tional sampling error. In particular, equation (25) as- 
sumes that Poisson errors dominate the sample variance 
errors, whereas equation (26) incorporates the additional 
stochastic noise from the distribution of intrinsic shapes. 
However, such sampling errors should be uncorrelated 
and will be ignored below. Note that excluding truly un- 
correlated errors is conservative, since they would reduce 
the cross-correlation coefficient defined below (eq. (30)). 
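
The number of observables quoted above follows from counting the $2\ell + 1$ values of $m$ for each multipole from 41 to 1000 in each of the 8 shear bins, and adding the 26 count bins. A minimal check, with variable names chosen here for illustration:

```python
N_COUNT_BINS = 26
N_SHEAR_BINS = 8
ELL_MIN, ELL_MAX = 41, 1000

# Each multipole ell contributes 2*ell + 1 values of m.
modes_per_bin = sum(2 * ell + 1 for ell in range(ELL_MIN, ELL_MAX + 1))
n_shear_observables = N_SHEAR_BINS * modes_per_bin
n_observables = n_shear_observables + N_COUNT_BINS
```

The sum can also be written in closed form as $(\ell_{\rm max} + 1)^2 - \ell_{\rm min}^2$ per bin.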

Under the assumption that the full joint likelihood 
function is Gaussian, the Fisher matrix depends only 
on the mean $\bar{x}$ and the covariance matrix $C = \langle(x - \bar{x})(x - \bar{x})^T\rangle$, 
and on the derivatives of these quanti- 
ties with respect to the model parameters $p$ [36] (note 
that in general, $\bar{x}$ and $C$ both depend on $p$). In our 
case, the full covariance matrix $C$ contains the terms 
$\langle(N_i - \bar{N}_i)(N_j - \bar{N}_j)\rangle$ and $\langle a^b_{\ell m}\,a^{b'*}_{\ell' m'}\rangle$, which describe the 
sample variance in the number counts and in the shear 
field, respectively.$^1$ The cross-terms, $\langle(N_i - \bar{N}_i)\,a^b_{\ell m}\rangle$, 
describe the covariance between number counts and the 
shear field. Here $\bar{N}_i$ is the mean number of clusters given 


$^1$ Note that eq. (26) depends on $C_\ell$ and its derivatives, rather than 
$a_{\ell m}$. This dependence arises from taking the expectation value 
$C_\ell = [1/(2\ell + 1)]\,\langle\sum_{m=-\ell}^{+\ell}|a_{\ell m}|^2\rangle$. 
FIG. 1: Absolute values of the cross-correlation coefficients 
(defined in eq. (30) in the text) between the number of galaxy 
clusters in the redshift range $0.1 < z < 0.15$ and the spherical 
harmonic expansion coefficients (at $m = 0$) of the shear map 
from source galaxies in the redshift range $0 < z < 0.4$. The 
figure shows that the cross-correlation coefficients are of order 
$10^{-3}$ or smaller for $\ell > 40$. 



in equation (1), while we have $\bar{a}^b_{\ell m} = 0$, and the averages 
are taken over many realizations, or a large volume. 

In the Appendix, we calculate the cross-terms explic- 
itly, and show that they are given by a simple expression, 



$$\langle(N_i - \bar{N}_i)\,a^b_{\ell m}\rangle = \delta_{m0}\,\frac{3\pi^2\,\Omega_m H_0^2\,\chi_{z_i}(1 + z_i)}{2c^2\ell^3}\,Q_\ell\,\bar{N}_i\,b(z_i)\,W^b(z_i)\,\Delta^2\!\left(k = \frac{\ell}{\chi_{z_i}}, z_i\right). \qquad (29)$$

Here $Q_\ell$ is the spherical harmonic transform of an az- 
imuthally symmetric angular window function, $W^b$ is the 
lensing window function (given in equation (19)), and $b(z)$ 
is the mean bias factor of the cluster counts (averaged 
over clusters above the detection threshold). The above 
result assumes that the cluster number counts trace the 
matter density field with the linear bias factor $b$, and 
we have also used the Limber approximation. The lat- 
ter assumption should be justified for the angular modes 
we use ($\ell > 40$). Note that the cross-term vanishes for 
clusters behind the source galaxies (since $W^b(z_i) = 0$ for 
$z_i > z^b_{\rm max}$), and also for $m \neq 0$ (since our survey window 
is azimuthally symmetric). 

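We define the cross-correlation coefficient between the 
$N_i$ and the $a^b_{\ell m}$ as 

$$\xi_{i,b\ell m} = \frac{\langle(N_i - \bar{N}_i)\,a^b_{\ell m}\rangle}{\sqrt{\langle(N_i - \bar{N}_i)^2\rangle\,C^{b,b}_\ell}}. \qquad (30)$$

As an example, here we calculate the cross-correlation 
coefficient for the number counts and the shear field in 
their lowest respective redshift bins, $|\xi_{1,1\ell 0}|$. The results 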



FIG. 2: Contributions from different $\ell$-modes to the sample 
variance for the number of galaxy clusters in the first redshift 
bin ($0.1 < z < 0.15$), containing, on average, 7515 clusters. 
The figure shows that the sample variance is dominated by 
the largest angular scales. This explains the lack of cross- 
correlation between the counts and the shear field, when the 
latter is restricted to smaller angular scales ($\ell > 40$). 



are shown, as a function of $\ell$, in Figure 1. The figure 
shows that the $|\xi_{1,1\ell 0}|$ are small, of order $10^{-3}$ or less, 
for $\ell > 40$. For a different pair of redshift bins, $i$ and 
$b$, we expect that this order of magnitude would not change 
significantly, since the related quantities vary slowly with 
redshift. 

The fact that the $\xi_{i,b\ell m}$ are small can be explained 
by the following reasoning. For a given survey window, 
the variance in the cluster counts alone is dominated by 
the largest angular modes, due to cancellations among 
smaller-scale fluctuations along the direction transverse 
to the line of sight. In Figure 2 we explicitly show the 
contributions to the sample variance of the cluster counts 
in the first redshift bin for LSST (note that we do not use 
the Limber approximation for this calculation). Because 
of the large angular size of the window, the figure clearly 
shows that the variance is small for modes with $\ell \gtrsim 10$. 
Fluctuations in the underlying (isotropic) matter density 
field on different scales $\ell$ are uncorrelated, and we have 
limited the range of angular modes of the shear maps 
to $\ell > 40$. Since these relatively smaller-scale modes 
contribute little to the fluctuations in the number counts, 
we indeed expect the $\xi_{i,b\ell m}$ to be small. 

According to these results, the probability of drawing 
a set of $N_i$ and $a^b_{\ell m}$ is given by a multi-variate Gaus- 
sian, with a total covariance matrix that consists of 
four blocks, 

$$C_{\rm tot} = \begin{pmatrix} S^{\rm counts} & C^{\rm cross} \\ (C^{\rm cross})^T & S^{\gamma\gamma} \end{pmatrix}, \qquad (31)$$

where $S^{\rm counts}$ and $S^{\gamma\gamma}$ are the sample variance matri- 
ces for counts and shear alone. When $|\xi_{i,b\ell m}| \ll 1$, this 
matrix can be well approximated (for example, for the 
purpose of taking its inverse) by 

$$C_{\rm tot} \approx \begin{pmatrix} S^{\rm counts} & 0 \\ 0 & S^{\gamma\gamma} \end{pmatrix}. \qquad (32)$$



A full treatment would incorporate the Poisson errors 
for the counts, and the shape errors for the shear; how- 
ever, these effects are relatively small, and could only 
decrease the value of $\xi$. The Fisher matrix also requires 
taking the derivatives with respect to the cosmological 
parameters. As long as these derivatives do not have a 
strong scale-dependence, and given the smallness of $\xi$, we 
expect our basic conclusion to carry over to the Fisher 
matrix. Therefore, the cluster number counts and shear- 
shear correlations can be treated as independent probes, 
even if they probe the same area of the sky. For the 
results below, we therefore simply add the two Fisher 
matrices, when we combine the two observables. 
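
Both steps of this argument can be illustrated with toy matrices. The sketch below first shows that a cross-correlation coefficient of order $10^{-3}$ changes the inverse of a unit-variance covariance matrix only at the $10^{-3}$ level, and then shows how adding two Fisher matrices with different degeneracy directions tightens the marginalized errors. All numbers are invented for illustration and the function names are hypothetical.

```python
import numpy as np

def combine_independent(F_counts, F_shear):
    """Fisher matrix of two probes treated as statistically independent."""
    return F_counts + F_shear

def marginalized_errors(F):
    """sqrt of the diagonal of the inverse Fisher matrix."""
    return np.sqrt(np.diag(np.linalg.inv(F)))

# (i) Effect of a small cross-correlation on the inverse covariance.
xi = 1e-3
C_full = np.array([[1.0, xi], [xi, 1.0]])
C_block = np.eye(2)
max_change = np.abs(np.linalg.inv(C_full) - np.linalg.inv(C_block)).max()

# (ii) Two toy probes with opposite degeneracy directions.
F_a = 100.0 * np.array([[1.0, 0.9], [0.9, 1.0]])
F_b = 100.0 * np.array([[1.0, -0.9], [-0.9, 1.0]])
err_a = marginalized_errors(F_a)
err_ab = marginalized_errors(combine_independent(F_a, F_b))
```

In this toy example each probe alone gives an error of about 0.23 on either parameter, while the combination gives about 0.07, well below the 0.16 expected from adding the individual constraints in quadrature.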



TABLE I: This table contains our main results. Marginal- 
ized errors are shown on cosmological parameters from cluster 
counts, shear-shear correlations, and their combination. The 
parameter $\eta$, shown in the 5th column, measures the synergy 
between the two observables, with $\eta = 1$ indicating no synergy, 
and lower values indicating significant degeneracy-breaking 
(see text for definition). In the 6th and 7th columns, we use 
the linear power spectrum for the shear (indicated here, and 
in the other tables below, by the superscript "l"). Priors 
from WMAP, $\Delta\Omega_b h^2 = 0.0010$, $\Delta n_s = 0.040$, are included for 
the 2nd, 4th, and 7th columns here, and in columns involving 
number counts in all other tables below (except Table IV). 

                        counts    $\gamma\gamma$    counts+$\gamma\gamma$    $\eta$    $\gamma\gamma^l$    counts+$\gamma\gamma^l$
$\Delta\Omega_{DE}$     0.064     0.0084    0.0026     0.097    0.013    0.0027
$\Delta\Omega_m h^2$    0.20      0.049     0.0061     0.017    0.034    0.0050
$\Delta\sigma_8$        0.029     0.012     0.0031     0.083    0.021    0.0031
$\Delta w_0$            0.080     0.078     0.033      0.34     0.12     0.031
$\Delta w_a$            1.24      0.28      0.11       0.18     0.42     0.099
$\Delta\Omega_b h^2$    0.0010    0.014     0.00099    0.99     0.010    0.00099
$\Delta n_s$            0.040     0.050     0.016      0.26     0.027    0.011




III. RESULTS 

The marginalized errors on the seven cosmological parameters from the number counts and the shear-shear correlations are listed in the 2nd and 3rd columns of Table I for our fiducial LSST-like survey. Note that the results scale simply as $\Delta\Omega^{-1/2}$ for a survey with a different solid angle coverage. The 4th column shows the result from combining the two observables, assuming that they are independent, so that the two Fisher matrices can simply be added (see the discussion of the covariance between these two probes in § II D above). In the limit that the two observables have the same degeneracies between parameters, their combination would be equivalent to simply adding the marginalized errors in quadrature. In the 5th column of Table I we show a "complementarity" parameter, $\eta$, which quantifies the effect of degeneracy-breaking that occurs when the two methods are combined. The parameter $\eta$ is defined as the improvement of the constraints beyond adding the two results in quadrature,



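$$\eta = \left(\Delta p^{{\rm counts}+\gamma\gamma}\right)^2 \left[\frac{1}{\left(\Delta p^{\rm counts}\right)^2} + \frac{1}{\left(\Delta p^{\gamma\gamma}\right)^2}\right]. \qquad (33)$$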



With this definition, $\eta = 1$ corresponds to no degeneracy breaking, and lower values of $\eta$ indicate larger benefits from the combination. The 6th and 7th columns are the same as the 3rd and 4th, except that, as an academic exercise, for the shear-shear correlations, we use the linear matter power spectrum instead of the nonlinear one (see the next section for a detailed discussion).
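As a rough numerical check of the definition of $\eta$, the sketch below recomputes the complementarity parameter for the three dark energy parameters from the rounded errors quoted in Table I. The function and variable names are ours, introduced only for illustration; because the inputs are rounded to two significant digits, the output should agree with the $\eta$ column only approximately.

```python
# Recompute the complementarity parameter eta from the (rounded) Table I errors.

def complementarity(err_counts, err_shear, err_combined):
    """eta = (combined error)^2 * [1/(counts error)^2 + 1/(shear error)^2]."""
    return err_combined ** 2 * (1.0 / err_counts ** 2 + 1.0 / err_shear ** 2)

# (counts, shear-shear, counts + shear-shear) marginalized errors from Table I
table_i = {
    "Omega_DE": (0.064, 0.0084, 0.0026),
    "w_0": (0.080, 0.078, 0.033),
    "w_a": (1.24, 0.28, 0.11),
}

eta = {name: complementarity(*errors) for name, errors in table_i.items()}

for name, value in eta.items():
    print(f"eta({name}) = {value:.3f}")
```

The residual differences from the tabulated values (most visible for $w_a$) reflect the rounding of the combined error.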

The first conclusion to draw from Table I is that the shear-shear correlations give tighter constraints on all cosmological parameters than cluster counts alone, especially on $\Omega_{\rm DE}$, $\Omega_m h^2$, $\sigma_8$ and $w_a$. This may not be surprising, given that number counts effectively measure only 26 numbers, while the shear power spectrum is effectively a measurement of many more parameters. On the other hand, cluster counts deliver a constraint comparable to that from the shear for $w_0$. We also note that constraints from cluster counts alone on $w_a$ are weak (as noted by [20], this can be significantly improved by adding the cluster power spectrum and CMB anisotropy as observables).²

More importantly, Table I shows that when the two methods are combined, the constraints on the cosmological parameters improve significantly. For most of the parameters, $\eta$ is small, indicating significant complementarity. In particular, focusing on the three dark-energy parameters, the combination tightens the constraints by a factor of 3-10 more than simply adding the marginalized errors in quadrature. The combined constraint on $\Omega_{\rm DE}$ is ~25 times better than that from counts and ~3 times better than that from $\gamma\gamma$; the combined constraint on $w_0$ is ~2 times better than that from either counts or $\gamma\gamma$; and the combined constraint on $w_a$ is ~11 times better than that from counts and ~3 times better than that from $\gamma\gamma$. These results are also shown graphically in Figures 3 and 4.

For reference, we follow the recommendation of the Dark Energy Task Force (DETF) [45], and compute the "pivot point" $a_p$, i.e. the value of the scale factor where $w(a)$ is best constrained. In Table II we list the values of the scale factor and redshift of this pivot point, as well as the errors on $w(a_p)$ and the "figure of merit" defined by the DETF, $(\Delta w_p \Delta w_a)^{-1}$. The figure of merit we find for the individual observables is in between the "optimistic" and "pessimistic" predictions by the DETF. The combined figure of merit, however, is significantly better than the most optimistic figure of merit in the DETF for a "stage IV" WL shear experiment alone.

² We note for reference that our shear correlation constraints are consistent with a slightly updated version of the results in ref. [15]. Our number-count-alone constraints on $\Omega_{\rm DE}$ and $w_a$, on the other hand, are significantly weaker than in ref. [20]. We have found that the discrepancy is due to inaccurate interpolation in ref. [20] to obtain the mass limit. We have also found, however, that the inaccuracies do not significantly alter the joint constraints when the number counts are combined with other observables.

TABLE II: This table recasts our results in terms of a pivot point. The pivot point is defined as the scale factor at which the equation of state $w_p = w(a_p)$ is best constrained. The last row shows the figure-of-merit proposed by the DETF [45].

Quantity                            counts    $\gamma\gamma$    counts+$\gamma\gamma$
$z_p$                               0.047     0.37              0.37
$1 - a_p$                           0.045     0.27              0.27
$\Delta w_p$                        0.057     0.020             0.010
$\Delta w_a$                        1.24      0.28              0.11
$(\Delta w_p \Delta w_a)^{-1}$      14.1      178.6             909.1

FIG. 3: Marginalized constraints in the ($\Omega_{\rm DE}$, $w_0$) plane for an LSST-like survey from the shear-selected cluster counts, the shear-shear correlations, and the combination of these two observables. Note that the cluster counts alone deliver weaker constraints, but still improve the $w_0$ errors from the shear-shear correlations by a factor of ~two, as a result of breaking degeneracies.

FIG. 4: Marginalized constraints in the ($w_0$, $w_a$) plane for an LSST-like survey from the shear-selected cluster counts, the shear-shear correlations, and the combination of these two observables.

IV. DISCUSSION

In this section, we discuss several aspects of the basic results presented in the previous section. In particular, we explain the reasons for the synergy between the two observables, and discuss various possible caveats that could modify our conclusions.


A. Degeneracy Breaking by the Two Observables 

Table III lists the uncertainties on the 7 cosmological parameters when fixing all the other 6 parameters at their fiducial values. By comparing Table III with Table I, we see that, using either observable alone, degeneracies among the 7 parameters lead to a great degradation of the constraints. Clearly, combining these two observables can decrease the effect of this degradation by breaking degeneracies. To understand this better, we find the worst-constrained directions in the 7D parameter space for both of the observables. We do this by diagonalizing the Fisher matrix, and finding the eigenvector that corresponds to the smallest eigenvalue, i.e. the direction along which the probe is least sensitive and which is hence constrained worst. For the number counts, we find that the ($\Omega_{\rm DE}$, $\Omega_m h^2$, $\sigma_8$, $w_0$, $w_a$, $\Omega_b h^2$, $n_s$)-components of the unit vector pointing in this direction are (0.050, 0.15, 0.023, -0.044, 0.99, 0, -0.00016), with a corresponding eigenvalue of 0.63. This implies that constraints on ($\Omega_{\rm DE}$, $w_0$, $w_a$) cannot be better than (0.063, 0.056, 1.24). For shear-shear correlations, the direction of the worst degeneracy is (0.026, 0.084, 0.038, -0.26, 0.95, 0.022, -0.10), with a corresponding eigenvalue of 12. Again, this implies that constraints on ($\Omega_{\rm DE}$, $w_0$, $w_a$) cannot be better than (0.0077, 0.076, 0.28). In both cases, we find that the 2nd worst eigenvalue is much greater than the worst: by a factor of $\sim 5^2$ for the shear-shear correlations, and by a factor of $17^2$ for the number counts. Apparently the 7-dimensional error ellipsoid is very narrow, with a large extension in one direction that nearly (but not exactly) coincides with the $w_a$ axis.
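The lower bounds quoted above follow from the eigen-decomposition of the inverse Fisher matrix, $F^{-1} = \sum_k \mathbf{v}_k \mathbf{v}_k^T / \lambda_k$, which implies $(F^{-1})_{ii} \geq v_{{\rm min},i}^2 / \lambda_{\rm min}$. The sketch below evaluates this floor from the eigenvector components and eigenvalues quoted in the text; the function name is ours, and small differences from the quoted bounds are expected because the inputs are rounded.

```python
import math

# Parameter ordering used in the text for the eigenvector components.
names = ["Omega_DE", "Omega_m h^2", "sigma_8", "w_0", "w_a", "Omega_b h^2", "n_s"]

# Worst-constrained direction (unit eigenvector) and its eigenvalue, as quoted.
counts_direction = [0.050, 0.15, 0.023, -0.044, 0.99, 0.0, -0.00016]
counts_eigenvalue = 0.63
shear_direction = [0.026, 0.084, 0.038, -0.26, 0.95, 0.022, -0.10]
shear_eigenvalue = 12.0

def error_floor(direction, eigenvalue):
    """Lower bound on each marginalized error: |v_i| / sqrt(lambda_min)."""
    scale = 1.0 / math.sqrt(eigenvalue)
    return {name: abs(component) * scale for name, component in zip(names, direction)}

counts_floor = error_floor(counts_direction, counts_eigenvalue)
shear_floor = error_floor(shear_direction, shear_eigenvalue)

for key in ("Omega_DE", "w_0", "w_a"):
    print(f"{key}: counts >= {counts_floor[key]:.4f}, shear >= {shear_floor[key]:.4f}")
```

Comparing these floors with the marginalized errors in Table I shows that the single worst direction accounts for nearly all of the marginalized uncertainty on the dark energy parameters.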
TABLE III: This table shows the effect of marginalization. The parameter errors are shown from the two observables, as in Table I, but before marginalization (i.e. assuming the other 6 parameters are fixed).

Parameter                  counts     $\gamma\gamma$
$\Delta\Omega_{\rm DE}$    0.00036    0.00041
$\Delta\Omega_m h^2$       0.0013     0.0015
$\Delta\sigma_8$           0.00042    0.00064
$\Delta w_0$               0.0054     0.0060
$\Delta w_a$               0.029      0.024
$\Delta\Omega_b h^2$       0.00050    0.00060
$\Delta n_s$               0.0034     0.0044

TABLE IV: This table shows the effect of the WMAP priors. Marginalized errors are shown from cluster counts, and the combination of the counts and the shear-shear correlations, as in the 2nd and 4th columns of Table I, except that we exclude the WMAP priors.

Parameter                  counts (no priors)    counts+$\gamma\gamma$ (no priors)
$\Delta\Omega_{\rm DE}$    0.15                  0.0027
$\Delta\Omega_m h^2$       3.92                  0.045
$\Delta\sigma_8$           0.067                 0.0034
$\Delta w_0$               0.20                  0.037
$\Delta w_a$               1.40                  0.14
$\Delta\Omega_b h^2$       1.46                  0.013
$\Delta n_s$               5.04                  0.044

TABLE V: This table shows the effect of including shear measurements on small angular scales. Marginalized errors are shown on the cosmological parameters, as in Table I. The difference from Table I is that we have used additional small scales for the shear-shear correlations, by increasing the cutoff from $\ell_{\max} = 1000$ to $\ell_{\max} = 3000$.

Parameter                  counts    $\gamma\gamma$    counts+$\gamma\gamma$    $\eta$    $\gamma\gamma^{\rm l}$    counts+$\gamma\gamma^{\rm l}$
$\Delta\Omega_{\rm DE}$    0.064     0.0046            0.0023                   0.25      0.012                     0.0025
$\Delta\Omega_m h^2$       0.20      0.034             0.0050                   0.023     0.021                     0.0039
$\Delta\sigma_8$           0.029     0.0060            0.0027                   0.21      0.019                     0.0029
$\Delta w_0$               0.080     0.046             0.025                    0.38      0.11                      0.028
$\Delta w_a$               1.24      0.15              0.079                    0.27      0.39                      0.084
$\Delta\Omega_b h^2$       0.0010    0.010             0.00099                  1.0       0.0072                    0.00099
$\Delta n_s$               0.040     0.026             0.0097                   0.19      0.011                     0.0054

Given the very severe degeneracies, and the fact that they do not point in the same direction for the two observables, it is not surprising that the combination of the two observables leads to a tightening of the constraints beyond adding the marginalized errors in quadrature.

Finally, since we have added WMAP priors for the number counts constraints, it is useful to ask how important these priors were for the combined errors. In Table IV we show the constraints from the cluster counts, and the combination of the counts and the shear-shear correlations, as in the 2nd and 4th columns of Table I, except that we exclude the WMAP priors. As Table IV shows, the cluster counts are insensitive to $\Omega_b h^2$ and $n_s$, and degeneracies with these parameters also degrade the constraints on other parameters. However, the shear-shear correlations provide a sufficiently accurate measurement of $\Omega_b h^2$ and $n_s$, so the WMAP priors are not important for the combined constraints.
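In the Fisher formalism, an independent Gaussian prior of width $\sigma_{\rm prior}$ on a parameter is included by adding $1/\sigma_{\rm prior}^2$ to the corresponding diagonal element of the Fisher matrix. The sketch below shows this on a hypothetical two-parameter Fisher matrix (the numbers are invented and do not correspond to the survey calculation), where a prior on a poorly measured parameter also tightens the error on the parameter degenerate with it.

```python
# Adding a Gaussian prior to a toy 2x2 Fisher matrix (hypothetical numbers).

def invert_2x2(m):
    """Invert a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def marginalized_errors(fisher):
    """Marginalized 1-sigma errors from the diagonal of the inverse Fisher matrix."""
    cov = invert_2x2(fisher)
    return [cov[0][0] ** 0.5, cov[1][1] ** 0.5]

def add_prior(fisher, index, sigma_prior):
    """Return a copy of the Fisher matrix with a Gaussian prior on one parameter."""
    updated = [row[:] for row in fisher]
    updated[index][index] += 1.0 / sigma_prior ** 2
    return updated

fisher = [[400.0, 190.0], [190.0, 100.0]]  # parameter 2 is poorly constrained
errors_no_prior = marginalized_errors(fisher)
errors_with_prior = marginalized_errors(add_prior(fisher, index=1, sigma_prior=0.1))

print("without prior:", errors_no_prior)
print("with prior   :", errors_with_prior)
```

When a second probe already measures the parameter more precisely than the prior does, the added term becomes negligible, which is the situation described above for the combined constraints.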



B. Information from Small-Scale Shear-Shear Correlations and Non-Linearities



In the above results, we have imposed a small-scale cutoff, $\ell_{\max} = 1000$, for the shear-shear correlations. It is possible, however, at least in principle, to obtain accurate non-linear shear power spectra, using numerical simulations on a large grid of cosmological parameters. It is therefore interesting to ask whether including higher $\ell$ modes could improve the final results significantly. To answer this question, we repeated all the calculations in Table I, but this time with $\ell_{\max} = 3000$. The results are listed in Table V.

First, we notice from either Table I or Table V that constraints from the shear-shear correlations alone are better when the non-linear power spectrum is used than when the linear version is used. This suggests that there is extra information in the shear-shear correlations that comes from the non-linear effects. Furthermore, comparing Table I to Table V, we see that the improvement on dark energy constraints from these non-linearities is only about 50% for $\ell_{\max} = 1000$, but increases to a factor of 2-3 for $\ell_{\max} = 3000$. However, Tables I and V both show that once the cluster count information is added, the nonlinear effects on the shear-shear power spectrum become essentially irrelevant, even for $\ell_{\max} = 3000$.

This indicates that while non-linear effects change the shear-shear power spectrum, the information content of these changes is sub-dominant compared with that probed by number counts, at least for the non-linearities contained in modes with $\ell < 3000$ (note that clusters are strongly nonlinear objects). Given these results, we are satisfied to neglect the higher $\ell$ modes and stick to our original choice of $\ell < 1000$. The fact that higher $\ell$ modes ($1000 < \ell < 3000$) help little (~10%) with the linear shear-shear correlations, but help more (~50%) with the nonlinear shear-shear correlations, together with the fact that adding higher $\ell$ modes helps little (30% at most) in either case when the number counts are included, is again an indication that the nonlinear evolution information in these modes is sub-dominant compared to the information contained in number counts.
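The size of the non-linear gain quoted above can be read off directly from the tables as the ratio of the linear to the non-linear shear-only errors. The short sketch below does this arithmetic for the three dark energy parameters, using the rounded shear-only entries of Tables I and V; the dictionary layout and names are ours.

```python
# Ratio of linear to non-linear shear-only errors for the dark energy parameters.
# Each entry is (non-linear error, linear error), taken from the shear-only columns.

table_i_lmax_1000 = {
    "Omega_DE": (0.0084, 0.013),
    "w_0": (0.078, 0.12),
    "w_a": (0.28, 0.42),
}
table_v_lmax_3000 = {
    "Omega_DE": (0.0046, 0.012),
    "w_0": (0.046, 0.11),
    "w_a": (0.15, 0.39),
}

def nonlinear_gain(table):
    """Factor by which the non-linear spectrum tightens each shear-only error."""
    return {name: linear / nonlinear for name, (nonlinear, linear) in table.items()}

gain_1000 = nonlinear_gain(table_i_lmax_1000)
gain_3000 = nonlinear_gain(table_v_lmax_3000)

for name in gain_1000:
    print(f"{name}: gain = {gain_1000[name]:.2f} (l_max=1000), {gain_3000[name]:.2f} (l_max=3000)")
```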
TABLE VI: This table shows the effect of excluding shear measurements on the largest scales. Marginalized errors are shown on the cosmological parameters, as in Table I. The difference from Table I is that we have neglected large scales for the shear-shear correlations, by increasing the cutoff from $\ell_{\min} = 41$ to $\ell_{\min} = 100$. As the comparison of the two tables shows, the constraints do not degrade significantly, implying that most of the information is at relatively small angular scales ($\ell > 100$).

Parameter                  counts    $\gamma\gamma$    counts+$\gamma\gamma$    $\eta$    $\gamma\gamma^{\rm l}$    counts+$\gamma\gamma^{\rm l}$
$\Delta\Omega_{\rm DE}$    0.064     0.0087            0.0026                   0.094     0.014                     0.0027
$\Delta\Omega_m h^2$       0.20      0.051             0.0075                   0.023     0.036                     0.0060
$\Delta\sigma_8$           0.029     0.012             0.0032                   0.081     0.022                     0.0031
$\Delta w_0$               0.080     0.081             0.034                    0.35      0.13                      0.032
$\Delta w_a$               1.24      0.29              0.12                     0.18      0.45                      0.10
$\Delta\Omega_b h^2$       0.0010    0.015             0.0010                   1.0       0.012                     0.00099
$\Delta n_s$               0.040     0.052             0.019                    0.38      0.028                     0.013

C. Large Scale Shear-Shear Correlations and Dark Energy Clustering

The previous sub-section showed that while most of the information in the non-linear shear-shear correlations is on small scales, the information contained in the linear shear-shear correlations is coming mostly from larger scales, $\ell < 1000$. For completeness, we here ask whether the largest scales ($\ell \sim 40$) actually dominate the shear-shear information. We show, in Table VI, the constraints recalculated as in Table I, except that we have neglected large scales for the shear-shear correlations, by increasing the cutoff from $\ell_{\min} = 41$ to $\ell_{\min} = 100$. As the comparison of the two tables shows, the constraints do not degrade significantly, implying that most of the information is at smaller angular scales ($\ell > 100$).

A related issue is that in scalar field models of dark energy (e.g. [28]), such as quintessence [10], the field clusters on large scales, while it remains smooth on small scales. This is different from a cosmological constant, which remains smooth on all scales, and the additional dark-energy fluctuations can enhance the matter fluctuations on large scales. It is interesting to ask whether this enhancement may be detectable through the shear-shear correlations [11]. We take, as an example, the shear-shear auto power spectrum ($C_\ell^{8}$) of the 8th redshift bin. Compared with other bins, this has the largest comoving radial distance, so that for fixed $\ell$, it probes the largest comoving scales, and should be most sensitive to the clustering effect of the quintessence field. We calculate $C_\ell^{8}$ for both our fiducial model (with a cosmological constant) and a quintessence cosmological model with $w_0 = -0.5$, with all other parameters fixed. In the quintessence model, we use KINKFAST with the choice for the transfer function that includes dark energy perturbations.

The quintessence field affects the power spectrum through the expansion rate, the growth rate and the enhancement of the matter fluctuations on large scales, causing the deviations of these two $C_\ell^{8}$ curves from each other. To separate the clustering effect from the other two effects, we artificially replace the transfer function in the $w = -0.5$ quintessence cosmological model by the one that excludes the effect of dark energy clustering (i.e. the $w = -1$ transfer function), and calculate $C_\ell^{8}$ for the quintessence cosmological model again. We find that the difference caused by dark energy clustering is unfortunately quite small, safely within the error bars. A remaining issue is that the above treatment only computes the fluctuations in the matter density induced by dark energy clustering. On the other hand, the weak lensing signal is sensitive to fluctuations in the total gravitational potential, which has an additional contribution directly from the fluctuations in the dark energy component. However, we expect these two contributions to be of similar order of magnitude. We conclude, in agreement with ref. [11], that while the shear-shear correlations can tell the quintessence field apart from a cosmological constant, the distinction is made purely through the effect on the growth rate and expansion rate, and the clustering of dark energy on large scales remains undetectable.



D. Shear Power Spectrum vs. Cluster Power 
Spectrum 

Once the galaxy clusters are detected, their spatial distribution, characterized, e.g., by the cluster power spectrum ($P_c(k)$), readily offers another constraint on cosmology. Since earlier works [20, 13] have studied the complementarity of the cluster counts and their power spectrum, here we contrast the $dN/dz + P_c(k)$ combination with the $dN/dz + C_\ell$ combination. To perform this comparison, we divide the clusters according to their redshift into 6 bins, each with a size of $\Delta z = 0.2$, except the farthest one, which has a size of $\Delta z = 0.3$. For each bin, we follow Hu & Haiman [48] and compute the cluster power spectrum over $30 \times 30$ $k$-space cells centered at $k_\parallel, k_\perp = 0.005, 0.010, \ldots, 0.15\ {\rm Mpc}^{-1}$, where $k_\parallel, k_\perp$ are wave-numbers parallel and transverse to the line of sight, respectively. The methods to obtain constraints on cosmological parameters from the cluster power spectrum are the same as those described in [20]. Our results are shown in Table VII, separately for the number counts (as before), the power spectrum, and their combination. We have simply summed the two Fisher matrices, assuming that the two measurements are independent and have no covariance. This assumption was implicitly made in previous works [10, 13], but it could be justified by arguments similar to those in the previous section. The 5th column in the Table shows the complementarity parameter, defined analogously to that in Table I.

A comparison of Table I and Table VII reveals that shear-shear correlations give tighter constraints on $\Omega_{\rm DE}$, $w_0$, $w_a$ than the cluster power spectrum, either by itself, or in combination with the number counts. In particular, the combined constraints are a factor of ~1.5-2 better when the shear-shear correlations are used. The shear-shear correlations utilize structure information within the redshift range of [0, 3.2], while the cluster power spectrum utilizes only that in the redshift range of [0.1, 1.4] (few clusters can be detected at redshifts beyond this range, see [13]). In Table VIII we show the constraints as in Table I, except that we have used only the first four redshift bins for the shear-shear correlations, that is, only the source galaxies with redshift [0, 1.6] are considered. The Table shows that the constraints would typically degrade by ~30% if the high-z tail of galaxies were discarded. We conclude that about half of the advantage of the shear-shear correlations over the cluster power spectrum comes from this high-z tail; the rest of the improvement is due to the fact that at low redshift, the shear-shear correlations (with $\ell_{\max} = 1000$ corresponding to $k \sim 1$ at $z \sim 0.2$) probe smaller spatial scales than we used for the power spectrum ($k_{\max} = 0.15\ {\rm Mpc}^{-1}$).

TABLE VII: This table replaces the shear-shear correlations by the cluster power spectrum. Marginalized errors are shown on cosmological parameters from cluster counts, cluster power spectrum, and their combination. The parameter $\eta$, shown in the 5th column, measures the synergy between the two observables, as in Table I.

Parameter                  counts    $P_c(k)$    counts+$P_c(k)$    $\eta$
$\Delta\Omega_{\rm DE}$    0.064     0.0095      0.0040             0.18
$\Delta\Omega_m h^2$       0.20      0.027       0.0047             0.030
$\Delta\sigma_8$           0.029     0.020       0.0050             0.091
$\Delta w_0$               0.080     0.12        0.057              0.75
$\Delta w_a$               1.24      0.54        0.23               0.23
$\Delta\Omega_b h^2$       0.0010    0.0063      0.00097            0.97
$\Delta n_s$               0.040     0.046       0.013              0.19

TABLE VIII: This table shows the effect of excluding the highest redshift galaxies from the shear measurements. Marginalized errors are shown on the cosmological parameters, as in Table I, except that here we have used only the first four redshift bins for the shear-shear correlations, that is, only the source galaxies with redshift [0, 1.6] are considered.

Parameter                  counts    $\gamma\gamma$    counts+$\gamma\gamma$    $\eta$    $\gamma\gamma^{\rm l}$    counts+$\gamma\gamma^{\rm l}$
$\Delta\Omega_{\rm DE}$    0.064     0.011             0.0034                   0.11      0.035                     0.0035
$\Delta\Omega_m h^2$       0.20      0.11              0.0095                   0.0093    0.065                     0.0070
$\Delta\sigma_8$           0.029     0.018             0.0042                   0.076     0.080                     0.0044
$\Delta w_0$               0.080     0.10              0.046                    0.53      0.23                      0.049
$\Delta w_a$               1.24      0.40              0.17                     0.21      1.18                      0.18
$\Delta\Omega_b h^2$       0.0010    0.032             0.0010                   1.0       0.020                     0.0010
$\Delta n_s$               0.040     0.12              0.025                    0.43      0.041                     0.014



E. The Impact of Systematic Errors 

The above results (listed in Table I) are encouraging, and suggest that cluster counts will be a useful complement to the shear-shear correlations, despite the fact that constraints from the counts alone are weaker. However, a general concern with this conclusion is that systematic errors, which will inevitably degrade constraints from individual observables, may additionally degrade their synergy. Here we briefly examine some aspects of this question.

First, the major concern with selecting clusters from their weak lensing shear alone is that projection effects will produce false detections (contamination) and cause real clusters to drop out of the sample (incompleteness) [H, EH, El]. In principle, these effects can be modeled in ab-initio simulations, but for a very large survey, such as LSST, the contamination and completeness have to be quantified to a very stringent fractional accuracy of $N_{\rm total}^{-1/2} \sim 2 \times 10^{-3}$ in order not to dominate Poisson errors. An alternative approach would be to utilize other (optical or X-ray) data to improve the accuracy of the selection function (see ref. [20] for more discussion).

Here we note that if we restrict our analysis to increasingly high-$\sigma$ shear peaks, our results should become increasingly realistic. This is for two reasons: (i) contamination and completeness improve rapidly as the detection threshold (or cluster mass) is increased [HI, 0, 49], and (ii) these peaks are rare, and therefore the required $N_{\rm total}^{-1/2}$ accuracy for the selection function becomes less stringent, and easier to achieve in simulations. Here we simply examine the effect of increasing the threshold $\kappa_{\rm th}$, to quantify whether a smaller cluster sample, derived from higher shear peaks, is still useful. In Table IX we repeat our calculations from Table I, except that we replace the threshold $\kappa_{\rm th} = 4.5\,\sigma_{\rm noise}$ by $\kappa_{\rm th} = 10, 20, 30\,\sigma_{\rm noise}$. The table shows that the number of clusters diminishes rapidly: $N_{\rm total} = 276{,}794 \rightarrow 30{,}554 \rightarrow 1{,}954 \rightarrow 205$, respectively, as the threshold is increased. On the other hand, despite this decrease, cluster counts remain useful in tightening the constraints. For example, the most massive ~30,000 clusters still improve dark energy constraints by a factor of two relative to using the shear-shear correlations alone. Even the most massive ~200-2,000 clusters, which, by themselves, do not offer interesting constraints on dark energy, still improve the constraints when added to the shear-shear correlations. This gives us confidence that cluster counts will be a useful complement to the shear-shear correlations, despite systematic errors in the weak lensing cluster selection function.
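The way the required selection-function accuracy relaxes with the threshold can be seen from the Poisson fractional error, $N_{\rm total}^{-1/2}$, for each of the cluster totals quoted above. The sketch below is only this arithmetic; the dictionary keys (thresholds in units of $\sigma_{\rm noise}$) and names are ours.

```python
import math

# Total cluster counts quoted in the text for each detection threshold
# (threshold given in units of sigma_noise).
n_total = {4.5: 276794, 10: 30554, 20: 1954, 30: 205}

# Fractional Poisson error N^(-1/2): the selection function must be known
# to roughly this accuracy in order not to dominate the Poisson errors.
poisson_fraction = {threshold: 1.0 / math.sqrt(n) for threshold, n in n_total.items()}

for threshold, fraction in poisson_fraction.items():
    print(f"kappa_th = {threshold:>4} sigma_noise: N = {n_total[threshold]:>7}, N^-1/2 = {fraction:.4f}")
```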

TABLE IX: This table examines raising the shear-detection threshold. Marginalized errors are shown on cosmological parameters, as in Table I, except that we adopt increasingly more stringent detection thresholds for the convergence. Note that cluster counts improve dark energy constraints when added to the shear-shear correlations, even for exceedingly high detection thresholds.

Parameter                  counts    $\gamma\gamma$    counts+$\gamma\gamma$    $\eta$

$\kappa_{\rm th} = 10\,\sigma_{\rm noise}$, $N_{\rm total} = 30{,}554$
$\Delta\Omega_{\rm DE}$    0.10      0.0084            0.0046                   0.30
$\Delta\Omega_m h^2$       0.35      0.049             0.0062                   0.017
$\Delta\sigma_8$           0.14      0.012             0.0054                   0.21
$\Delta w_0$               0.28      0.078             0.048                    0.40
$\Delta w_a$               2.18      0.28              0.14                     0.26
$\Delta\Omega_b h^2$       0.0010    0.014             0.00099                  0.99
$\Delta n_s$               0.040     0.050             0.017                    0.30

$\kappa_{\rm th} = 20\,\sigma_{\rm noise}$, $N_{\rm total} = 1{,}954$
$\Delta\Omega_{\rm DE}$    0.083     0.0084            0.0065                   0.61
$\Delta\Omega_m h^2$       0.71      0.049             0.0063                   0.017
$\Delta\sigma_8$           0.33      0.012             0.0084                   0.51
$\Delta w_0$               2.78      0.078             0.060                    0.59
$\Delta w_a$               6.96      0.28              0.19                     0.45
$\Delta\Omega_b h^2$       0.0010    0.014             0.00099                  0.99
$\Delta n_s$               0.040     0.050             0.017                    0.31

$\kappa_{\rm th} = 30\,\sigma_{\rm noise}$, $N_{\rm total} = 205$
$\Delta\Omega_{\rm DE}$    0.52      0.0084            0.0075                   0.81
$\Delta\Omega_m h^2$       1.21      0.049             0.0064                   0.018
$\Delta\sigma_8$           0.44      0.012             0.010                    0.76
$\Delta w_0$               12.3      0.078             0.067                    0.74
$\Delta w_a$               39.4      0.28              0.22                     0.65
$\Delta\Omega_b h^2$       0.0010    0.014             0.00099                  0.99
$\Delta n_s$               0.040     0.050             0.018                    0.33

TABLE X: This table examines the effect of scatter in the mass-convergence relation. Marginalized errors are shown on cosmological parameters, as in Table I, except that we allow for an additional free parameter, $\epsilon$, representing a scatter between cluster mass and the convergence it produces. We assume a Gaussian distribution for this scatter, and adopt a prior of $\Delta\epsilon = 0.3$ when the constraints from the number counts are marginalized over $\epsilon$.

Parameter                  counts    $\gamma\gamma$    counts+$\gamma\gamma$    $\eta$    $\gamma\gamma^{\rm l}$    counts+$\gamma\gamma^{\rm l}$
$\Delta\Omega_{\rm DE}$    0.11      0.0084            0.0026                   0.096     0.013                     0.0028
$\Delta\Omega_m h^2$       0.35      0.049             0.0061                   0.016     0.034                     0.0053
$\Delta\sigma_8$           0.032     0.012             0.0034                   0.096     0.021                     0.0052
$\Delta w_0$               0.077     0.078             0.032                    0.35      0.12                      0.035
$\Delta w_a$               1.32      0.28              0.11                     0.17      0.42                      0.14
$\Delta\Omega_b h^2$       0.0010    0.014             0.00099                  0.99      0.010                     0.00099
$\Delta n_s$               0.040     0.050             0.016                    0.28      0.027                     0.012

Another significant concern is that clusters are not spherical structures, and, when viewed from different directions, will produce a different shear. Simulations suggest that this can cause an irreducible scatter, and possibly a bias, in the relation between halo mass and shear [E3, [EH, [E2]. In addition, fluctuations caused by large-scale structure along the line of sight will introduce a scatter. While once again these effects can be studied in ab-initio simulations, and may be correctable statistically [53] or by identifying foreground lensing galaxies and directly subtracting their lensing effect [54], the accuracy to which the magnitude and shape of the unknown scatter will be reduced is not yet clear. Here we perform a simple exercise, and model the probability distribution $p(\kappa|M,z)$ for a dark matter halo with fixed mass $M$ at redshift $z$ to produce a smoothed convergence $\kappa$ to be given by a Gaussian,

$$p(\kappa|M,z) = \frac{1}{\sqrt{2\pi}\,\sigma_\kappa} \exp\left[-\frac{(\kappa - \kappa_{\rm NFW})^2}{2\sigma_\kappa^2}\right], \qquad (34)$$

where $\kappa_{\rm NFW}$ is calculated by assuming an NFW density profile for the dark matter halo as described in § II above, and we assume $\sigma_\kappa = \epsilon\,\kappa_{\rm NFW}$. We assume that the value of $\kappa$ at fixed mass is known ab-initio to within ~30%, i.e. we adopt the fiducial value of $\epsilon = 0.3$. The probability of detecting this dark matter halo by setting a detection threshold of $\kappa_{\rm th}$ is then

$$P(M, z) = \frac{1}{2}\,{\rm erfc}\left[\frac{\kappa_{\rm th} - \kappa_{\rm NFW}}{\sqrt{2}\,\sigma_\kappa}\right], \qquad (35)$$

and equation (1) is modified to

$$N_i = \Delta\Omega\,\Delta z\,\frac{d^2V}{dz\,d\Omega}(z_i) \int \frac{dn}{dM}(M, z_i)\,P(M, z_i)\,dM. \qquad (36)$$

First, this assumed fiducial scatter increases the total number of detected clusters (from 276,794 to 305,385). We then recompute the constraints, letting $\epsilon$ be an additional free parameter, adopting a weak prior of $\Delta\epsilon = 0.3$. The estimated uncertainties on the cosmological parameters from number counts after marginalizing over $\epsilon$, and from its combination with shear-shear correlations, are listed in Table X. A comparison with Table I reveals that although the cluster-count constraints degrade somewhat due to this uncertain scatter, the combined constraints degrade very little. Therefore, we conclude that as long as the $\kappa - M$ relation can be characterized ab-initio to within ~30%, we expect our results to remain realistic.
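The detection probability for a halo with Gaussian scatter in its convergence can be evaluated directly with the complementary error function. The sketch below assumes, as in the text, a scatter $\sigma_\kappa = \epsilon\,\kappa_{\rm NFW}$ with $\epsilon = 0.3$; the function name and the example convergence values (expressed in units of the threshold) are hypothetical and chosen only for illustration.

```python
import math

def detection_probability(kappa_nfw, kappa_th, epsilon=0.3):
    """Probability that a halo with mean convergence kappa_nfw exceeds kappa_th,
    for Gaussian scatter sigma_kappa = epsilon * kappa_nfw."""
    sigma_kappa = epsilon * kappa_nfw
    argument = (kappa_th - kappa_nfw) / (math.sqrt(2.0) * sigma_kappa)
    return 0.5 * math.erfc(argument)

kappa_th = 1.0  # threshold; halo convergences below are in units of kappa_th
p_below = detection_probability(0.5, kappa_th)      # halo well below the threshold
p_at = detection_probability(1.0, kappa_th)         # halo exactly at the threshold
p_above = detection_probability(2.0, kappa_th)      # halo well above the threshold

print("kappa_NFW = 0.5 kappa_th:", p_below)
print("kappa_NFW = 1.0 kappa_th:", p_at)
print("kappa_NFW = 2.0 kappa_th:", p_above)
```

Without scatter the selection would be a sharp step at $\kappa_{\rm NFW} = \kappa_{\rm th}$; with scatter, halos just below the threshold acquire a non-zero detection probability. Because low-mass halos are far more abundant than high-mass ones, this is a plausible reason for the net increase in the total number of detected clusters noted above.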

Another issue is whether photometric redshift accuracies will limit the constraints quoted here. In the case of shear tomography, the impact of redshift uncertainties has been considered in detail in refs [H, 56]. While our redshift bins are relatively wide, calibration of the photometric errors to the accuracy required to avoid degrading dark energy parameter constraints will likely require a large spectroscopic follow-up program. We refer the reader to refs [H^, for detailed treatments. In the case of cluster counts, we used 26 redshift bins in our analysis, effectively requiring that we know cluster redshifts to within $\Delta z \approx 0.05$. This is comparable to expected photometric redshift errors, and should be feasible to achieve for most of the clusters, for which a secure identification of cluster membership can be made for a few galaxies. On the other hand, the use of 26 bins is not actually required: one expects that fewer bins are sufficient, since the cluster abundance varies relatively smoothly with redshift.

To address this issue, in Figure 5 we show the marginalized errors on the dark energy parameters from the number counts, as a function of the number of redshift bins $N_b$. The flatness of the curves on the figure for $N_b > 10$ shows that the full information content of the abundance evolution can be extracted with rather modest cluster redshift accuracies of $\Delta z \approx 0.15$. This accuracy, however, is still a factor of ~3 more stringent than the r.m.s. redshift errors expected to be available from tomography alone [19]. Figure 5 shows that with tomographic redshifts alone (i.e. with only $N_b \sim 3$ redshift bins), there would be no interesting constraints on dark energy parameters. Hence, to realize the full potential of the survey, it will be important to securely identify member galaxies in the low-mass clusters and groups at the detection threshold.

FIG. 5: The marginalized errors are shown on dark energy parameters from the cluster number counts, as a function of the number of redshift bins ($N_b$) used in the analysis. The bins are assumed to be equally spaced in redshift. The flatness of the curves for $N_b > 10$ shows that the full information content of the abundance evolution can be extracted with rather modest cluster redshift accuracies of $\Delta z \approx 0.15$.

V. CONCLUSIONS

In this paper, we studied the possibility of improving constraints on dark energy properties by combining two observables, the number counts of detected galaxy clusters and large angular scale tomographic shear-shear correlations, that will both be automatically available in a future large weak lensing survey. We showed that the covariance between these two observables is small, and argued that they can therefore be regarded as independent constraints on dark energy parameters. We used the Fisher matrix formalism to forecast the expected statistical errors from either observable, and from their combination. We found that combining the two observables results in an improvement on dark energy parameter uncertainties by a factor of 2-25, relative to using either observable alone. We have argued that this conclusion may survive in the face of systematic errors. Our results also suggest that neither observable may exhaust the full information content of a non-linear weak lensing map.
APPENDIX

In this Appendix, we give the details of our calculation of the covariance between the cluster counts and the
expansion coefficients of the E modes of the shear maps, i.e. equation (29).



1. Sample Variance of Number Counts 

Following [43], we model the number counts of galaxy clusters in each redshift bin with independent Poisson
distributions, whose means are drawn from correlated Gaussian distributions. Given a cosmological model, the
probability of drawing $\{N_i,\ i = 1, \dots, n;\ n = 26\}$ clusters is given by

\[
P(\mathbf{N}|\bar{\mathbf{N}}, \mathbf{S}^{\rm counts}) = \int d^n M \left[\prod_{i=1}^{n} P(N_i|M_i)\right] G(\mathbf{M}|\bar{\mathbf{N}}, \mathbf{S}^{\rm counts}), \tag{37}
\]

where $P(N_i|M_i)$ is the normalized Poisson distribution for $N_i$ with a mean $M_i$, and $G(\mathbf{M}|\bar{\mathbf{N}}, \mathbf{S}^{\rm counts})$ is the normalized
multi-variate Gaussian probability distribution for the $M_i$ with mean $\bar{\mathbf{N}}$. Here $\mathbf{N}$, $\bar{\mathbf{N}}$ and $\mathbf{M}$ are 26-dimensional
column vectors, representing the counts in the 26 redshift bins, and $\mathbf{S}^{\rm counts}$ is the $26 \times 26$ covariance matrix. The
integral represents averaging over a fluctuating mean $\mathbf{M}$. These fluctuations are due to large-scale fluctuations in the
underlying matter density field. Under the assumption that clusters trace the matter density field with a linear bias,
we have

\[
M_i - \bar{N}_i = V_i \int d^3x\, W_i(\mathbf{x}(\chi, \hat{\mathbf{n}}))\, b(\chi)\, \delta(\mathbf{x}(\chi, \hat{\mathbf{n}}))\, n(\chi), \tag{38}
\]

where we use the comoving radial distance $\chi$ as the time coordinate for the time-dependent quantities, $V_i$ is the
comoving volume for the $i$th redshift bin of cluster counts, $b(\chi)$ is the cluster-averaged linear bias [13], $\delta(\mathbf{x}(\chi, \hat{\mathbf{n}}))$
is the density contrast field of matter, and $W_i(\mathbf{x}(\chi, \hat{\mathbf{n}}))$ is the normalized survey window function, which we model to
be part of a spherical shell. With $W_i(\mathbf{x}(\chi, \hat{\mathbf{n}})) = R_i(\chi)\,\Theta(\hat{\mathbf{n}})$, we have

\[
R_i(\chi) = \begin{cases} \dfrac{3}{(\chi^i_{\max})^3 - (\chi^i_{\min})^3}, & \chi \in [\chi^i_{\min}, \chi^i_{\max}] \\[1ex] 0, & \text{otherwise} \end{cases} \tag{39}
\]

\[
\Theta(\hat{\mathbf{n}}) = \begin{cases} \left[2\pi (1 - \cos\theta_s)\right]^{-1}, & \theta \in [0, \theta_s],\ \phi \in [0, 2\pi) \\[1ex] 0, & \text{otherwise} \end{cases} \tag{40}
\]

where $\theta_s$ is the angular size of the survey region (in our case, for 18,000 deg$^2$ centered on the pole at $\theta = 0$,
$\theta_s = \arccos[1 - 5\pi/18]$), and $\Delta\chi^i\ (= \chi^i_{\max} - \chi^i_{\min})$ is the comoving radial extent of the $i$th redshift bin. $n(\chi)$ is the
expected comoving number density of the detectable clusters. It is related to $\bar{N}_i$ by

\[
\bar{N}_i = V_i \int d^3x\, W_i(\mathbf{x}(\chi, \hat{\mathbf{n}}))\, n(\chi). \tag{41}
\]

In the limit $\Delta\chi^i \to 0$, we have

\[
R_i(\chi) \to \frac{1}{\chi_i^2}\, \delta(\chi - \chi_i), \tag{42}
\]

and equation (41) reduces to equation (1).
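The count model of equation (37) can be illustrated for a single redshift bin with a short Monte Carlo sketch. It is not the calculation used in the paper: the mean count, the sample-variance r.m.s. and the helper names below are hypothetical values chosen only for illustration. The point it shows is that a Poisson variable whose mean is itself Gaussian-distributed has total variance equal to the Poisson (shot-noise) term plus the sample variance, $\bar N + S$.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw one Poisson variate with Knuth's multiplication method.

    Adequate for the modest means used in this illustration.
    """
    if mean <= 0.0:
        return 0
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_counts(nbar, sigma_sample, n_draws, seed=0):
    """Single-bin version of equation (37): Poisson counts around a Gaussian mean.

    nbar         -- mean count (hypothetical value)
    sigma_sample -- r.m.s. of the fluctuating mean M (hypothetical value)
    Returns the sample mean and sample variance of the simulated counts.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        m = rng.gauss(nbar, sigma_sample)      # fluctuating mean M
        draws.append(sample_poisson(max(m, 0.0), rng))
    mean = sum(draws) / n_draws
    var = sum((d - mean) ** 2 for d in draws) / (n_draws - 1)
    return mean, var


mean, var = simulate_counts(nbar=50.0, sigma_sample=5.0, n_draws=20000, seed=1)
# Expected variance is approximately nbar + sigma_sample**2 (shot noise + sample variance).
print(mean, var)
```

In the full 26-bin problem the Gaussian draw would be correlated between bins through $\mathbf{S}^{\rm counts}$; the single-bin case is kept here only to keep the sketch short.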

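The elements of the sample covariance matrix, $\mathbf{S}^{\rm counts}$, can be calculated by [43]

\[
S^{\rm counts}_{ij} = \langle (M_i - \bar{N}_i)(M_j - \bar{N}_j) \rangle = \bar{N}_i \bar{N}_j\, b(z_i) b(z_j) D(z_i) D(z_j) \sum_{\ell=0}^{\infty} 4\pi\, \Theta_\ell^2 \int \frac{dk}{k}\, R^i_\ell(k) R^j_\ell(k)\, \Delta^2(k, z = 0). \tag{43}
\]

Here $D(z)$ is the growth factor of mass fluctuations, normalized to unity today, $\Delta^2(k, z = 0)$ is the present-day
variance per $\ln k$ for the matter fluctuations, and $\Theta_\ell$, $R^i_\ell(k)$ are quantities related to the Fourier transform of the
survey window function, given by

\[
\Theta_\ell = \begin{cases} \dfrac{1}{\sqrt{4\pi}}, & \ell = 0 \\[2ex] \sqrt{\dfrac{2\ell + 1}{4\pi}}\; \dfrac{(1 + x)}{\ell(\ell + 1)}\, \left.\dfrac{dP_\ell(x)}{dx}\right|_{x = \cos\theta_s}, & \ell \geq 1 \end{cases} \tag{44}
\]

and

\[
R^i_\ell(k) = \frac{1}{\Delta\chi^i} \int_{\chi^i_{\min}}^{\chi^i_{\max}} d\chi\, j_\ell(k\chi), \tag{45}
\]

with $P_\ell(x)$ the $\ell$th order Legendre polynomials, and $j_\ell(k\chi)$ the $\ell$th order spherical Bessel functions.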
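As an illustration of equation (44), the angular coefficients $\Theta_\ell$ of the survey cap can be evaluated with a few lines of standard-library Python. This is a sketch under stated assumptions: it takes $\Theta_\ell$ to be the $m = 0$ spherical-harmonic coefficient of the normalized cap window of equation (40), uses $\cos\theta_s = 1 - 5\pi/18$ for the 18,000 deg$^2$ survey, and the function names are introduced here for illustration only.

```python
import math


def legendre_and_derivative(ell, x):
    """Return (P_ell(x), dP_ell/dx) from the three-term recurrence (requires |x| != 1)."""
    if ell == 0:
        return 1.0, 0.0
    p_prev, p = 1.0, x                      # P_0 and P_1
    for n in range(1, ell):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    # Standard identity: (x^2 - 1) P_ell'(x) = ell * (x P_ell(x) - P_{ell-1}(x))
    dp = ell * (x * p - p_prev) / (x * x - 1.0)
    return p, dp


def theta_ell(ell, x_s):
    """Cap coefficient Theta_ell for a survey edge at x_s = cos(theta_s)."""
    if ell == 0:
        return 1.0 / math.sqrt(4.0 * math.pi)
    _, dp = legendre_and_derivative(ell, x_s)
    norm = math.sqrt((2 * ell + 1) / (4.0 * math.pi))
    return norm * (1.0 + x_s) * dp / (ell * (ell + 1))


# Survey edge for 18,000 deg^2 centred on the pole.
x_s = 1.0 - 5.0 * math.pi / 18.0
for ell in (0, 1, 2, 10, 100):
    print(ell, theta_ell(ell, x_s))
```

The $\ell \geq 1$ branch follows from integrating $P_\ell$ over the cap, using $\int_{x_s}^{1} P_\ell(x)\,dx = (1 - x_s^2)\,P_\ell'(x_s)/[\ell(\ell+1)]$.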



2. Sample Variance of Shear 

Under the Gaussian assumption for the E mode of the shear field, the probability of drawing a set of spherical
expansion coefficients $a^b_{\ell m}$ in a given cosmological model is

\[
P(\mathbf{a}|\mathbf{C}^{\gamma\gamma}) = G(\mathbf{a}|\mathbf{0}, \mathbf{C}^{\gamma\gamma}), \tag{46}
\]

where $\mathbf{a}$ is the column vector of all the $a^b_{\ell m}$ (for our range of $41 \leq \ell \leq 1000$ and with 8 redshift bins, this vector has
dimension $8 \times \sum_{\ell=41}^{1000} (2\ell + 1) = 8{,}002{,}560$), $\mathbf{0}$ is a zero column vector of the same dimension, $\mathbf{C}^{\gamma\gamma} = \mathbf{S}^{\gamma\gamma} + \mathbf{N}^{\gamma\gamma}$ is
the covariance matrix for the $a^b_{\ell m}$, and $\mathbf{S}^{\gamma\gamma}$ and $\mathbf{N}^{\gamma\gamma}$ are as given in Sec. III B. Note that both $\mathbf{S}^{\gamma\gamma}$ for the shear field
and $\mathbf{S}^{\rm counts}$ for the cluster number counts come from fluctuations in the matter distribution. Since the shear field and
the clusters are detected in the same realization of the matter distribution, they cannot be strictly independent.
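The dimension quoted above can be checked directly; the variable names in this short sketch are illustrative, and it assumes that the multipole range includes both end points, $\ell = 41$ and $\ell = 1000$.

```python
n_bins = 8                      # tomographic redshift bins
ell_min, ell_max = 41, 1000     # multipole range, end points included

# Each multipole ell contributes 2*ell + 1 values of m.
modes_per_bin = sum(2 * ell + 1 for ell in range(ell_min, ell_max + 1))
dimension = n_bins * modes_per_bin

print(modes_per_bin, dimension)  # 1000320 8002560
```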

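The convergence (or E mode of the shear field) signal from the source galaxies in redshift bin $b$ is calculated by

\[
\kappa_b(\hat{\mathbf{n}}) = \frac{1}{c^2} \int_0^\infty d\chi\, W_b(\chi) \left(\nabla^2 - \nabla_\chi^2\right) \Phi(\mathbf{x}(\chi, \hat{\mathbf{n}})), \tag{47}
\]

where $\Phi(\mathbf{x}(\chi, \hat{\mathbf{n}}))$ is the gravitational potential field. As discussed in [58], the integral of the term that contains $\nabla_\chi^2 \Phi$
is much smaller than that containing $\nabla^2 \Phi$ everywhere except on the largest angular scales. We thus neglect this term
in the following calculation. Expanding $\kappa_b(\hat{\mathbf{n}})$ in spherical harmonics, we get

\[
a^b_{\ell m} = \int d\hat{\mathbf{n}}\, \kappa_b(\hat{\mathbf{n}})\, Y^*_{\ell m}(\hat{\mathbf{n}}) + \text{noise}. \tag{48}
\]

The Poisson equation reads as

\[
\nabla^2 \Phi(\mathbf{x}(\chi, \hat{\mathbf{n}})) = 4\pi G\, a^2(\chi)\, \rho_m(\chi)\, \delta(\mathbf{x}(\chi, \hat{\mathbf{n}})), \tag{49}
\]

with $\delta(\mathbf{x}(\chi, \hat{\mathbf{n}}))$ in this equation representing the (non-linear) matter density contrast. Using this equation, we find

\[
a^b_{\ell m} = \frac{4\pi G}{c^2} \int d\chi\, a^2(\chi)\, \rho_m(\chi)\, W_b(\chi) \int d\hat{\mathbf{n}}\, Y^*_{\ell m}(\hat{\mathbf{n}})\, \delta(\mathbf{x}(\chi, \hat{\mathbf{n}})) + \text{noise}. \tag{50}
\]

The sample variance $\mathbf{S}^{\gamma\gamma}$ and full covariance $\mathbf{C}^{\gamma\gamma}$ can then be computed as the ensemble average $\langle a^b_{\ell m}\, a^{b'*}_{\ell' m'} \rangle$ with
or without the noise term (the latter leads to the expression for $\mathbf{S}^{\gamma\gamma}$ given in Sec. III B).
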

3. Covariance Between Number Counts and Shear 

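The correlation between $(M_i - \bar{N}_i)$ and $a^b_{\ell m}$ is nonzero because they depend on the same density field. For clarity,
from now on, we add a prime to the coordinates relating to the calculation of $a^b_{\ell m}$. The correlation between the two
density contrast fields is given by the two-point function

\[
\langle \delta(\mathbf{x}(\chi, \hat{\mathbf{n}}))\, \delta(\mathbf{x}'(\chi', \hat{\mathbf{n}}')) \rangle = D(\chi) D(\chi') \int \frac{d^3k}{(2\pi)^3}\, e^{i\mathbf{k}\cdot(\mathbf{x} - \mathbf{x}')}\, P(k, \chi_p = 0). \tag{51}
\]

The covariance of $(M_i - \bar{N}_i)$ and $a^b_{\ell m}$ is then given by

\[
\begin{aligned}
\langle (M_i - \bar{N}_i)\, a^b_{\ell m} \rangle = {} & \frac{4\pi G}{c^2}\, V_i \int \chi^2 d\chi\, R_i(\chi)\, b(\chi)\, n(\chi)\, D(\chi) \int d\chi'\, a^2(\chi')\, \rho_m(\chi')\, W_b(\chi')\, D(\chi') \\
& \times \int \frac{k^2 dk}{(2\pi)^3}\, P(k, \chi_p = 0) \int d\Omega_k \int d\Omega'\, e^{-i\chi' \mathbf{k}\cdot\hat{\mathbf{n}}'}\, Y^*_{\ell m}(\hat{\mathbf{n}}') \int d\Omega\, e^{i\chi \mathbf{k}\cdot\hat{\mathbf{n}}}\, \Theta(\hat{\mathbf{n}}).
\end{aligned} \tag{52}
\]

Note that noise from the intrinsic ellipticity of galaxies does not correlate with this fluctuation of cluster counts, so the noise
term drops out here. The integral over $\Omega$ gives

\[
I_\Omega = \sum_{\ell'=0}^{\infty} \sum_{m'=-\ell'}^{\ell'} 4\pi\, i^{\ell'} j_{\ell'}(k\chi)\, Y_{\ell' m'}(\hat{\mathbf{k}})\, \Theta_{\ell'}\, \delta_{m'0}, \tag{53}
\]

and the integral over $\Omega'$ gives

\[
I_{\Omega'} = 4\pi (-i)^\ell\, Y^*_{\ell m}(\hat{\mathbf{k}})\, j_\ell(k\chi'). \tag{54}
\]

We are now ready to calculate the integral over $\Omega_k$, which gives

\[
I_{\Omega_k} = \delta_{m0}\, (4\pi)^2\, \Theta_\ell\, j_\ell(k\chi)\, j_\ell(k\chi'). \tag{55}
\]

We next use the Limber approximation when integrating over $k$. Assuming that $P(k, \chi_p = 0)$ is a slowly varying
function compared to the $j_\ell(k\chi)$, we have

\[
I_k \approx \delta_{m0}\, \Theta_\ell\, \frac{1}{\chi^2}\, \delta(\chi - \chi')\, P\!\left(k = \frac{\ell}{\chi}, \chi_p = 0\right), \tag{56}
\]

where we have used the orthogonality property

\[
\int_0^\infty k^2 dk\, j_\ell(k\chi)\, j_\ell(k\chi') = \frac{\pi}{2\chi^2}\, \delta(\chi - \chi') \tag{57}
\]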
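for $\ell > 0$. The integral over $\chi'$ is easy to perform, and we finally have

\[
\langle (M_i - \bar{N}_i)\, a^b_{\ell m} \rangle = \frac{4\pi G}{c^2}\, \delta_{m0}\, V_i\, \Theta_\ell \int d\chi\, R_i(\chi)\, b(\chi)\, n(\chi)\, a^2(\chi)\, \rho_m(\chi)\, W_b(\chi)\, P\!\left(k = \frac{\ell}{\chi}, \chi\right). \tag{58}
\]

Recalling the evolution of the matter density,

\[
\rho_m(\chi) = \Omega_m\, \frac{3H_0^2}{8\pi G}\, \frac{1}{a^3(\chi)}, \tag{59}
\]

and the definition of the dimensionless power spectrum,

\[
\Delta^2(k, \chi) = \frac{k^3}{2\pi^2}\, P(k, \chi), \tag{60}
\]

and taking the limit $\Delta\chi^i \to 0$, we obtain the final result

\[
\langle (M_i - \bar{N}_i)\, a^b_{\ell m} \rangle = \delta_{m0}\, \frac{3\pi^2\, \Omega_m H_0^2\, \chi_i}{c^2\, \ell^3\, a(z_i)}\, \Theta_\ell\, \bar{N}_i\, b(z_i)\, W_b(z_i)\, \Delta^2\!\left(k = \frac{\ell}{\chi_i}, z_i\right). \tag{61}
\]

This result is quoted above in equation (29), and the corresponding cross-correlation coefficient is evaluated explicitly
in one example in Figure 1.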
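As a worked check of this last step, the algebra leading from equation (58) to equation (61) can be sketched as follows. The sketch assumes, as in equation (42), that the bin is thin enough for $R_i(\chi) \to \delta(\chi - \chi_i)/\chi_i^2$, so that $V_i\, n(\chi_i) = \bar{N}_i$ by equation (41); the shorthand $a_i \equiv a(\chi_i)$ is introduced here only for compactness.

```latex
\begin{align*}
\langle (M_i-\bar N_i)\, a^b_{\ell m}\rangle
 &\to \frac{4\pi G}{c^2}\,\delta_{m0}\,\Theta_\ell\,\bar N_i\, b(z_i)\,
      \frac{a_i^2\,\rho_m(\chi_i)\,W_b(\chi_i)}{\chi_i^2}\,
      P\!\left(k=\frac{\ell}{\chi_i},\chi_i\right), \\
4\pi G\, a_i^2\,\rho_m(\chi_i) &= \frac{3}{2}\,\frac{\Omega_m H_0^2}{a_i}, \qquad
P\!\left(k=\frac{\ell}{\chi_i},\chi_i\right)
   = \frac{2\pi^2\chi_i^3}{\ell^3}\,\Delta^2\!\left(k=\frac{\ell}{\chi_i},z_i\right), \\
\Rightarrow\quad \langle (M_i-\bar N_i)\, a^b_{\ell m}\rangle
 &= \delta_{m0}\,\frac{3\pi^2\,\Omega_m H_0^2\,\chi_i}{c^2\,\ell^3\,a_i}\,
    \Theta_\ell\,\bar N_i\, b(z_i)\, W_b(z_i)\,
    \Delta^2\!\left(k=\frac{\ell}{\chi_i},z_i\right).
\end{align*}
```

The second line uses equations (59) and (60) evaluated at $k = \ell/\chi_i$; the factor $\chi_i^3/\chi_i^2$ leaves the single power of $\chi_i$ in the prefactor.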

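Given the above result on the cross correlation, we can write down the expression for the joint probability distribution
for simultaneously drawing a set of $N_i$ and $a^b_{\ell m}$:

\[
P(\mathbf{N}, \mathbf{a}|\bar{\mathbf{N}}, \mathbf{C}_{\rm tot}) = \int d^n M \left[\prod_{i=1}^{n} P(N_i|M_i)\right] G(\mathbf{T}|\bar{\mathbf{T}}, \mathbf{C}_{\rm tot}), \tag{62}
\]

where $\mathbf{T}$ is a column vector that combines $\mathbf{M}$ and $\mathbf{a}$; $\bar{\mathbf{T}}$ is the column vector containing the mean values $\bar{\mathbf{N}}$ and $\mathbf{0}$;
and $\mathbf{C}_{\rm tot}$ is the total covariance matrix, consisting of four blocks,

\[
\mathbf{C}_{\rm tot} = \begin{pmatrix} \mathbf{S}^{\rm counts} & \mathbf{C}^{\rm cross} \\ (\mathbf{C}^{\rm cross})^T & \mathbf{C}^{\gamma\gamma} \end{pmatrix}. \tag{63}
\]

Note that here $\mathbf{C}^{\gamma\gamma}$ is the full shear covariance matrix, including the shear noise (in practice, we found shear noise
to have only a small effect on the results). When the cross-correlation coefficients $\xi$ are small, we can write the
multi-variate Gaussian as a product of two independent Gaussians,

\[
G(\mathbf{T}|\bar{\mathbf{T}}, \mathbf{C}_{\rm tot}) = G(\mathbf{M}|\bar{\mathbf{N}}, \mathbf{S}^{\rm counts})\, G(\mathbf{a}|\mathbf{0}, \mathbf{C}^{\gamma\gamma}) + \mathcal{O}(\xi), \tag{64}
\]

and the joint probability becomes separable,

\[
P(\mathbf{N}, \mathbf{a}|\bar{\mathbf{N}}, \mathbf{C}_{\rm tot}) = P(\mathbf{N}|\bar{\mathbf{N}}, \mathbf{S}^{\rm counts})\, P(\mathbf{a}|\mathbf{C}^{\gamma\gamma}) + \mathcal{O}(\xi), \tag{65}
\]

justifying the assumption that cluster number counts and shear-shear correlations can be treated as independent
cosmological probes.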
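The factorization argument can be illustrated with the smallest possible case, a two-variable Gaussian standing in for one count fluctuation and one shear coefficient. The sketch below is only a toy: the variances, the evaluation point and the function names are hypothetical and are not taken from the paper. It compares the log of the joint Gaussian density with the sum of the two independent log-densities and shows that the difference shrinks in proportion to the correlation coefficient.

```python
import math


def log_gauss_1d(x, var):
    """Log-density of a zero-mean 1D Gaussian with variance var."""
    return -0.5 * (x * x / var + math.log(2.0 * math.pi * var))


def log_gauss_2d(x, y, var_x, var_y, cov_xy):
    """Log-density of a zero-mean 2D Gaussian with the given covariance entries."""
    det = var_x * var_y - cov_xy ** 2
    quad = (var_y * x * x - 2.0 * cov_xy * x * y + var_x * y * y) / det
    return -0.5 * (quad + math.log((2.0 * math.pi) ** 2 * det))


def factorization_error(rho, x=1.0, y=-0.5, var_x=2.0, var_y=0.5):
    """|log G_joint - (log G_x + log G_y)| for correlation coefficient rho.

    The default point and variances are arbitrary illustrative choices.
    """
    cov_xy = rho * math.sqrt(var_x * var_y)
    joint = log_gauss_2d(x, y, var_x, var_y, cov_xy)
    product = log_gauss_1d(x, var_x) + log_gauss_1d(y, var_y)
    return abs(joint - product)


for rho in (0.0, 0.001, 0.01, 0.1):
    print(rho, factorization_error(rho))
```

To leading order the difference is $\rho\, x y / (\sigma_x \sigma_y)$, i.e. first order in the correlation coefficient, which is the sense in which the correction terms in equations (64) and (65) are small when the cross-correlation is small.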



